mbox series

[0/2] fix RISC-V zcmp popretz [PR113715]

Message ID 20240605015024.7845-1-gaofei@eswincomputing.com
Headers show
Series fix RISC-V zcmp popretz [PR113715] | expand

Message

Fei Gao June 5, 2024, 1:50 a.m. UTC
The 1st patch adds a hook to allow post processing after epilogue inserted.
The 2nd one implement the RISC-V hook to solve PR113715.

Fei Gao (2):
  target hooks: allow post processing after epilogue inserted.
  [RISC-V]: fix zcmp popretz [PR113715].

 gcc/config/riscv/riscv.cc                   | 191 ++++++++++++++------
 gcc/doc/tm.texi                             |   5 +
 gcc/doc/tm.texi.in                          |   2 +
 gcc/function.cc                             |   2 +
 gcc/hooks.cc                                |   7 +
 gcc/hooks.h                                 |   1 +
 gcc/target.def                              |   8 +
 gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c |  56 ++++++
 8 files changed, 219 insertions(+), 53 deletions(-)

Comments

Kito Cheng June 5, 2024, 6:36 a.m. UTC | #1
Thanks for fixing this issue, and I am wondering doest it possible to
fix that without introduce target hook? I ask that because...GCC 14
also has this bug, but I am not sure it's OK to introduce new target
hook for release branch? or would you suggest we just revert patch to
fix that on GCC 14?

On Wed, Jun 5, 2024 at 9:50 AM Fei Gao <gaofei@eswincomputing.com> wrote:
>
> The 1st patch adds a hook to allow post processing after epilogue inserted.
> The 2nd one implement the RISC-V hook to solve PR113715.
>
> Fei Gao (2):
>   target hooks: allow post processing after epilogue inserted.
>   [RISC-V]: fix zcmp popretz [PR113715].
>
>  gcc/config/riscv/riscv.cc                   | 191 ++++++++++++++------
>  gcc/doc/tm.texi                             |   5 +
>  gcc/doc/tm.texi.in                          |   2 +
>  gcc/function.cc                             |   2 +
>  gcc/hooks.cc                                |   7 +
>  gcc/hooks.h                                 |   1 +
>  gcc/target.def                              |   8 +
>  gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c |  56 ++++++
>  8 files changed, 219 insertions(+), 53 deletions(-)
>
> --
> 2.17.1
>
Fei Gao June 5, 2024, 7:47 a.m. UTC | #2
On 2024-06-05 14:36  Kito Cheng <kito.cheng@gmail.com> wrote:
>
>Thanks for fixing this issue, and I am wondering doest it possible to
>fix that without introduce target hook? I ask that because...GCC 14
>also has this bug, but I am not sure it's OK to introduce new target
>hook for release branch? or would you suggest we just revert patch to
>fix that on GCC 14? 

If hook is unacceptable in GCC14, I suggest to revert on GCC 14 the following commit.
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864

I started fixing this issue by adding changes in mach pass but abandoned it
due to the following reasons:
1. more codes to detect location of epilogue in the whole insn list.
2. due to impact by scheduling pass, clear a0 and use a0 insns get reordered, resulting in more
    codes.
3. data flow analysis is needed, but insn does't have bb info any more, so rescan actually does
    nothing, which I guess there's some hidden issue in riscv_remove_unneeded_save_restore_calls
    using dfa.

So I came up this hook based patch in prologue and epilogue pass to make the optimization
happen as earlier as possible. It ends up with simplicity and clear logic.

BR
Fei

>
>On Wed, Jun 5, 2024 at 9:50 AM Fei Gao <gaofei@eswincomputing.com> wrote:
>>
>> The 1st patch adds a hook to allow post processing after epilogue inserted.
>> The 2nd one implement the RISC-V hook to solve PR113715.
>>
>> Fei Gao (2):
>>   target hooks: allow post processing after epilogue inserted.
>>   [RISC-V]: fix zcmp popretz [PR113715].
>>
>>  gcc/config/riscv/riscv.cc                   | 191 ++++++++++++++------
>>  gcc/doc/tm.texi                             |   5 +
>>  gcc/doc/tm.texi.in                          |   2 +
>>  gcc/function.cc                             |   2 +
>>  gcc/hooks.cc                                |   7 +
>>  gcc/hooks.h                                 |   1 +
>>  gcc/target.def                              |   8 +
>>  gcc/testsuite/gcc.target/riscv/rv32i_zcmp.c |  56 ++++++
>>  8 files changed, 219 insertions(+), 53 deletions(-)
>>
>> --
>> 2.17.1
>>
Jeff Law June 5, 2024, 1:51 p.m. UTC | #3
On 6/5/24 12:36 AM, Kito Cheng wrote:
> Thanks for fixing this issue, and I am wondering doest it possible to
> fix that without introduce target hook? I ask that because...GCC 14
> also has this bug, but I am not sure it's OK to introduce new target
> hook for release branch? or would you suggest we just revert patch to
> fix that on GCC 14?
The question I would ask is why is the target hook needed.  ie, what 
problem needs to be solved and how does a new target hook help.



Jeff
Jeff Law June 5, 2024, 1:58 p.m. UTC | #4
On 6/5/24 1:47 AM, Fei Gao wrote:
> 
> On 2024-06-05 14:36  Kito Cheng <kito.cheng@gmail.com> wrote:
>>
>> Thanks for fixing this issue, and I am wondering doest it possible to
>> fix that without introduce target hook? I ask that because...GCC 14
>> also has this bug, but I am not sure it's OK to introduce new target
>> hook for release branch? or would you suggest we just revert patch to
>> fix that on GCC 14?
> 
> If hook is unacceptable in GCC14, I suggest to revert on GCC 14 the following commit.
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
> 
> I started fixing this issue by adding changes in mach pass but abandoned it
> due to the following reasons:
> 1. more codes to detect location of epilogue in the whole insn list.
> 2. due to impact by scheduling pass, clear a0 and use a0 insns get reordered, resulting in more
>      codes.
> 3. data flow analysis is needed, but insn does't have bb info any more, so rescan actually does
>      nothing, which I guess there's some hidden issue in riscv_remove_unneeded_save_restore_calls
>      using dfa.
> 
> So I came up this hook based patch in prologue and epilogue pass to make the optimization
> happen as earlier as possible. It ends up with simplicity and clear logic.
But let's back up and get a good explanation of what the problem is. 
Based on patch 2/2 it looks like we have lost an assignment to the 
return register.

To someone not familiar with this code, it sounds to me like we've made 
a mistake earlier and we're now defining a hook that lets us go back and 
fix that earlier mistake.   I'm probably wrong, but so far that's what 
it sounds like.

So let's get a good explanation of the problem and perhaps we'll find a 
better way to solve it.

jeff
Fei Gao June 6, 2024, 2:42 a.m. UTC | #5
On 2024-06-05 21:58  Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
>On 6/5/24 1:47 AM, Fei Gao wrote:
>>
>> On 2024-06-05 14:36  Kito Cheng <kito.cheng@gmail.com> wrote:
>>>
>>> Thanks for fixing this issue, and I am wondering doest it possible to
>>> fix that without introduce target hook? I ask that because...GCC 14
>>> also has this bug, but I am not sure it's OK to introduce new target
>>> hook for release branch? or would you suggest we just revert patch to
>>> fix that on GCC 14?
>>
>> If hook is unacceptable in GCC14, I suggest to revert on GCC 14 the following commit.
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
>>
>> I started fixing this issue by adding changes in mach pass but abandoned it
>> due to the following reasons:
>> 1. more codes to detect location of epilogue in the whole insn list.
>> 2. due to impact by scheduling pass, clear a0 and use a0 insns get reordered, resulting in more
>>      codes.
>> 3. data flow analysis is needed, but insn does't have bb info any more, so rescan actually does
>>      nothing, which I guess there's some hidden issue in riscv_remove_unneeded_save_restore_calls
>>      using dfa.
>>
>> So I came up this hook based patch in prologue and epilogue pass to make the optimization
>> happen as earlier as possible. It ends up with simplicity and clear logic.
>But let's back up and get a good explanation of what the problem is.
>Based on patch 2/2 it looks like we have lost an assignment to the
>return register.
>
>To someone not familiar with this code, it sounds to me like we've made
>a mistake earlier and we're now defining a hook that lets us go back and
>fix that earlier mistake.   I'm probably wrong, but so far that's what
>it sounds like. 
Hi Jeff

You're right. Let me rephrase  patch 2/2 with more details. Search /* feigao to location the point I'm
tring to explain.

code snippets from gcc/function.cc
void
thread_prologue_and_epilogue_insns (void)
{
...
  /*feigao: 
        targetm.gen_epilogue () is called here to generate epilogue sequence.
	https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
	Commit above tries in targetm.gen_epilogue () to detect if
	there's li	a0,0 insn at the end of insn chain, if so, cm.popret
	is replaced by cm.popretz and li	a0,0 insn is deleted.
	
	issue of the commit above:
	Insertion of the generated epilogue sequence
	into the insn chain doesn't happen at this moment.
	If later shrink-wrap decides NOT to insert the epilogue sequence at the end
	of insn chain, then the li	a0,0 insn has already been mistakeny removed.  */
  rtx_insn *epilogue_seq = make_epilogue_seq ();

  /* Try to perform a kind of shrink-wrapping, making sure the
     prologue/epilogue is emitted only around those parts of the
     function that require it.  */
  /* feigao:
     Make a simple_return for those exits that run without prologue and
     force nonfallthru by e->flags &= ~EDGE_FALLTHRU	 */
  try_shrink_wrapping (&entry_edge, prologue_seq);

...

  edge exit_fallthru_edge = find_fallthru_edge (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds);

  /* feigao: fasle here in simple_return case, so no insertion of epilogue,
     but the li	a0,0 insn has already been mistakeny removed.  */
  if (exit_fallthru_edge) 
    {
      if (epilogue_seq)
	{
	  insert_insn_on_edge (epilogue_seq, exit_fallthru_edge);
	  commit_edge_insertions ();

	  /* The epilogue insns we inserted may cause the exit edge to no longer
	     be fallthru.  */
	  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
	    {
	      if (((e->flags & EDGE_FALLTHRU) != 0)
		  && returnjump_p (BB_END (e->src)))
		e->flags &= ~EDGE_FALLTHRU;
	    }

	  find_sub_basic_blocks (BLOCK_FOR_INSN (epilogue_seq));
	}
...

  if (epilogue_seq)
    {
      rtx_insn *insn, *next;

      /* Similarly, move any line notes that appear after the epilogue.
         There is no need, however, to be quite so anal about the existence
	 of such a note.  Also possibly move
	 NOTE_INSN_FUNCTION_BEG notes, as those can be relevant for debug
	 info generation.  */
      for (insn = epilogue_seq; insn; insn = next)
	{
	  next = NEXT_INSN (insn);
	  if (NOTE_P (insn)
	      && (NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG))
	    reorder_insns (insn, insn, PREV_INSN (epilogue_seq));
	}
    }

  /* feigao: new hook */
  targetm.post_epilogue_proc (epilogue_seq); 
...

BR, 
Fei 

>
>So let's get a good explanation of the problem and perhaps we'll find a
>better way to solve it.
>
>jeff
>
Jeff Law June 8, 2024, 8:36 p.m. UTC | #6
On 6/5/24 8:42 PM, Fei Gao wrote:

>> But let's back up and get a good explanation of what the problem is.
>> Based on patch 2/2 it looks like we have lost an assignment to the
>> return register.
>>
>> To someone not familiar with this code, it sounds to me like we've made
>> a mistake earlier and we're now defining a hook that lets us go back and
>> fix that earlier mistake.   I'm probably wrong, but so far that's what
>> it sounds like.
> Hi Jeff
> 
> You're right. Let me rephrase  patch 2/2 with more details. Search /* feigao to location the point I'm
> tring to explain.
> 
> code snippets from gcc/function.cc
> void
> thread_prologue_and_epilogue_insns (void)
> {
> ...
>    /*feigao:
>          targetm.gen_epilogue () is called here to generate epilogue sequence.
> 	https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
> 	Commit above tries in targetm.gen_epilogue () to detect if
> 	there's li	a0,0 insn at the end of insn chain, if so, cm.popret
> 	is replaced by cm.popretz and li	a0,0 insn is deleted.
So that seems like the critical issue.  Generation of the 
prologue/epilogue really shouldn't be changing other instructions in the 
instruction stream.  I'm not immediately aware of another target that 
does that, an it seems like a rather risky thing to do.


It looks like the cm.popretz's RTL exposes the assignment to a0 and 
there's a DCE pass that runs after insertion of the prologue/epilogue. 
So I would suggest leaving the assignment to a0 in the RTL chain and see 
if the later DCE pass after prologue generation eliminates the redundant 
assignment.  That seems a lot cleaner.



Jeff
Fei Gao June 28, 2024, 10:46 a.m. UTC | #7
On 2024-06-09 04:36  Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
>On 6/5/24 8:42 PM, Fei Gao wrote:
>
>>> But let's back up and get a good explanation of what the problem is.
>>> Based on patch 2/2 it looks like we have lost an assignment to the
>>> return register.
>>>
>>> To someone not familiar with this code, it sounds to me like we've made
>>> a mistake earlier and we're now defining a hook that lets us go back and
>>> fix that earlier mistake.   I'm probably wrong, but so far that's what
>>> it sounds like.
>> Hi Jeff
>>
>> You're right. Let me rephrase  patch 2/2 with more details. Search /* feigao to location the point I'm
>> tring to explain.
>>
>> code snippets from gcc/function.cc
>> void
>> thread_prologue_and_epilogue_insns (void)
>> {
>> ...
>>    /*feigao:
>>          targetm.gen_epilogue () is called here to generate epilogue sequence.
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
>> Commit above tries in targetm.gen_epilogue () to detect if
>> there's li	a0,0 insn at the end of insn chain, if so, cm.popret 
Hi Jeff
I should have made it clear that there're  li	a0,0 and use a0 insns instead of just li a0, 0 here.
>> is replaced by cm.popretz and li	a0,0 insn is deleted.
>So that seems like the critical issue.  Generation of the
>prologue/epilogue really shouldn't be changing other instructions in the
>instruction stream.  I'm not immediately aware of another target that
>does that, an it seems like a rather risky thing to do.
>
>
>It looks like the cm.popretz's RTL exposes the assignment to a0 and
>there's a DCE pass that runs after insertion of the prologue/epilogue.
>So I would suggest leaving the assignment to a0 in the RTL chain and see
>if the later DCE pass after prologue generation eliminates the redundant
>assignment.  That seems a lot cleaner. 
The DCE pass  after prologue generation may not help here for the following reasons:
1. The use a0 insn is not deletable, and then li a0,0 that defines a0 cannot be deleted.
2. We need to detect pattern (clear a0, use a0 and cm.popret) before generating cm.popretz. 
    I don't think DCE is a good place to put this piece of codes. 
    And I insist prologue and epilogue pass is a better place to do it with simplicity and clear logic
    as I explained earlier to Kito. Hook was added here safely without any impact on other targets.

Please let me know your idea. 

Thanks. 
Fei
>
>
>
>Jeff
Jeff Law July 6, 2024, 7:52 p.m. UTC | #8
On 6/28/24 4:46 AM, Fei Gao wrote:
> On 2024-06-09 04:36  Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>>
>>
>> On 6/5/24 8:42 PM, Fei Gao wrote:
>>
>>>> But let's back up and get a good explanation of what the problem is.
>>>> Based on patch 2/2 it looks like we have lost an assignment to the
>>>> return register.
>>>>
>>>> To someone not familiar with this code, it sounds to me like we've made
>>>> a mistake earlier and we're now defining a hook that lets us go back and
>>>> fix that earlier mistake.   I'm probably wrong, but so far that's what
>>>> it sounds like.
>>> Hi Jeff
>>>
>>> You're right. Let me rephrase  patch 2/2 with more details. Search /* feigao to location the point I'm
>>> tring to explain.
>>>
>>> code snippets from gcc/function.cc
>>> void
>>> thread_prologue_and_epilogue_insns (void)
>>> {
>>> ...
>>>      /*feigao:
>>>            targetm.gen_epilogue () is called here to generate epilogue sequence.
>>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
>>> Commit above tries in targetm.gen_epilogue () to detect if
>>> there's li	a0,0 insn at the end of insn chain, if so, cm.popret
> Hi Jeff
> I should have made it clear that there're  li	a0,0 and use a0 insns instead of just li a0, 0 here.
Of course.  The USE represents the return value being live at the end of 
the function.  I'd assumed that.


>>> is replaced by cm.popretz and li	a0,0 insn is deleted.
>> So that seems like the critical issue.  Generation of the
>> prologue/epilogue really shouldn't be changing other instructions in the
>> instruction stream.  I'm not immediately aware of another target that
>> does that, an it seems like a rather risky thing to do.
>>
>>
>> It looks like the cm.popretz's RTL exposes the assignment to a0 and
>> there's a DCE pass that runs after insertion of the prologue/epilogue.
>> So I would suggest leaving the assignment to a0 in the RTL chain and see
>> if the later DCE pass after prologue generation eliminates the redundant
>> assignment.  That seems a lot cleaner.
> The DCE pass  after prologue generation may not help here for the following reasons:
> 1. The use a0 insn is not deletable, and then li a0,0 that defines a0 cannot be deleted.
Why not?  If we expose both assignments properly, then one of them is 
clearly dead.  Though perhaps the li a0,0 is redundant rather than dead. 
It may be the case that we need to model this more like DSE.

> 2. We need to detect pattern (clear a0, use a0 and cm.popret) before generating cm.popretz.
I'm not entirely sure of this.

>      I don't think DCE is a good place to put this piece of codes.
>      And I insist prologue and epilogue pass is a better place to do it with simplicity and clear logic
>      as I explained earlier to Kito. Hook was added here safely without any impact on other targets.
I hook is just papering over the fact that the epilogue generation with 
popretz is wrong.

Jeff
Jeff Law July 7, 2024, 2:53 p.m. UTC | #9
On 6/8/24 2:36 PM, Jeff Law wrote:
> 
> 
> On 6/5/24 8:42 PM, Fei Gao wrote:
> 
>>> But let's back up and get a good explanation of what the problem is.
>>> Based on patch 2/2 it looks like we have lost an assignment to the
>>> return register.
>>>
>>> To someone not familiar with this code, it sounds to me like we've made
>>> a mistake earlier and we're now defining a hook that lets us go back and
>>> fix that earlier mistake.   I'm probably wrong, but so far that's what
>>> it sounds like.
>> Hi Jeff
>>
>> You're right. Let me rephrase  patch 2/2 with more details. Search /* 
>> feigao to location the point I'm
>> tring to explain.
>>
>> code snippets from gcc/function.cc
>> void
>> thread_prologue_and_epilogue_insns (void)
>> {
>> ...
>>    /*feigao:
>>          targetm.gen_epilogue () is called here to generate epilogue 
>> sequence.
>>     https://gcc.gnu.org/git/? 
>> p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
>>     Commit above tries in targetm.gen_epilogue () to detect if
>>     there's li    a0,0 insn at the end of insn chain, if so, cm.popret
>>     is replaced by cm.popretz and li    a0,0 insn is deleted.
> So that seems like the critical issue.  Generation of the prologue/ 
> epilogue really shouldn't be changing other instructions in the 
> instruction stream.  I'm not immediately aware of another target that 
> does that, an it seems like a rather risky thing to do.
> 
> 
> It looks like the cm.popretz's RTL exposes the assignment to a0 and 
> there's a DCE pass that runs after insertion of the prologue/epilogue. 
> So I would suggest leaving the assignment to a0 in the RTL chain and see 
> if the later DCE pass after prologue generation eliminates the redundant 
> assignment.  That seems a lot cleaner.
So I looked at this in a bit more detail.  I'm going to explicitly 
reject this patch now.

The removal of the set to a0 in riscv_gen_multi_pop_insn looks wrong on 
multiple levels.  I don't think you have enough context in that routine 
or its callers to know if it's safe  ie given this fragment of RTL:

> (call_insn 12 11 13 3 (parallel [
>             (call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
>                 (const_int 0 [0]))
>             (use (unspec:SI [
>                         (const_int 0 [0])
>                     ] UNSPEC_CALLEE_CC))
>             (clobber (reg:SI 1 ra))
>         ]) "j.c":14:9 441 {call_internal}
>      (expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>)
>         (nil))
>     (expr_list:SI (use (reg:SI 10 a0))
>         (nil)))
> 
> (code_label 13 12 14 4 2 (nil) [1 uses])
> 
> (note 14 13 19 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> 
> (insn 19 14 20 4 (set (reg/i:SI 10 a0)
>         (const_int 0 [0])) "j.c":18:1 276 {*movsi_internal}
>      (nil))
> 
> (insn 20 19 24 4 (use (reg/i:SI 10 a0)) "j.c":18:1 -1
>      (nil))


You delete insns 19 and 20 resulting in this:

> (call_insn 12 11 13 3 (parallel [
>             (call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
>                 (const_int 0 [0]))
>             (use (unspec:SI [
>                         (const_int 0 [0])
>                     ] UNSPEC_CALLEE_CC))
>             (clobber (reg:SI 1 ra))
>         ]) "j.c":14:9 441 {call_internal}
>      (expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>)
>         (nil))
>     (expr_list:SI (use (reg:SI 10 a0))
>         (nil)))
> 
> (code_label 13 12 14 4 2 (nil) [1 uses])
> 
> (note 14 13 24 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
> 
> (note 24 14 0 NOTE_INSN_DELETED)

Which is incorrect/inconsistent RTL.  And as I've noted before, it's 
conceptually wrong for the backend code to be removing insns from the 
insn chain during prologue/epilogue generation.

I realize you're trying to use a hook to limit how this impacts other 
targets, but if you're making a bad decision in the RISC-V backend code, 
working around it with a target hook rather than fixing the core problem 
in the RISC-V backend just makes the whole situation worse.

My suggest is this.  Leave the assignment to a0 and use alone.  That's 
likely going to result in some kind of code size regression, but not a 
correctness regression.  Then address the code size regressions with the 
invariant that prologue/epilogue generation must not change existing 
insns on the insn chain.

jeff
Fei Gao July 8, 2024, 5:53 a.m. UTC | #10
On 2024-07-07 22:53  Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
>On 6/8/24 2:36 PM, Jeff Law wrote:
>>
>>
>> On 6/5/24 8:42 PM, Fei Gao wrote:
>>
>>>> But let's back up and get a good explanation of what the problem is.
>>>> Based on patch 2/2 it looks like we have lost an assignment to the
>>>> return register.
>>>>
>>>> To someone not familiar with this code, it sounds to me like we've made
>>>> a mistake earlier and we're now defining a hook that lets us go back and
>>>> fix that earlier mistake.   I'm probably wrong, but so far that's what
>>>> it sounds like.
>>> Hi Jeff
>>>
>>> You're right. Let me rephrase  patch 2/2 with more details. Search /*
>>> feigao to location the point I'm
>>> tring to explain.
>>>
>>> code snippets from gcc/function.cc
>>> void
>>> thread_prologue_and_epilogue_insns (void)
>>> {
>>> ...
>>>    /*feigao:
>>>          targetm.gen_epilogue () is called here to generate epilogue
>>> sequence.
>>>     https://gcc.gnu.org/git/?
>>> p=gcc.git;a=commit;h=b27d323a368033f0b37e93c57a57a35fd9997864
>>>     Commit above tries in targetm.gen_epilogue () to detect if
>>>     there's li    a0,0 insn at the end of insn chain, if so, cm.popret
>>>     is replaced by cm.popretz and li    a0,0 insn is deleted.
>> So that seems like the critical issue.  Generation of the prologue/
>> epilogue really shouldn't be changing other instructions in the
>> instruction stream.  I'm not immediately aware of another target that
>> does that, an it seems like a rather risky thing to do.
>>
>>
>> It looks like the cm.popretz's RTL exposes the assignment to a0 and
>> there's a DCE pass that runs after insertion of the prologue/epilogue.
>> So I would suggest leaving the assignment to a0 in the RTL chain and see
>> if the later DCE pass after prologue generation eliminates the redundant
>> assignment.  That seems a lot cleaner.
>So I looked at this in a bit more detail.  I'm going to explicitly
>reject this patch now.
>
>The removal of the set to a0 in riscv_gen_multi_pop_insn looks wrong on
>multiple levels.  I don't think you have enough context in that routine
>or its callers to know if it's safe  ie given this fragment of RTL:
>
>> (call_insn 12 11 13 3 (parallel [
>>             (call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
>>                 (const_int 0 [0]))
>>             (use (unspec:SI [
>>                         (const_int 0 [0])
>>                     ] UNSPEC_CALLEE_CC))
>>             (clobber (reg:SI 1 ra))
>>         ]) "j.c":14:9 441 {call_internal}
>>      (expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>)
>>         (nil))
>>     (expr_list:SI (use (reg:SI 10 a0))
>>         (nil)))
>>
>> (code_label 13 12 14 4 2 (nil) [1 uses])
>>
>> (note 14 13 19 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
>>
>> (insn 19 14 20 4 (set (reg/i:SI 10 a0)
>>         (const_int 0 [0])) "j.c":18:1 276 {*movsi_internal}
>>      (nil))
>>
>> (insn 20 19 24 4 (use (reg/i:SI 10 a0)) "j.c":18:1 -1
>>      (nil))
>
>
>You delete insns 19 and 20 resulting in this:
>
>> (call_insn 12 11 13 3 (parallel [
>>             (call (mem:SI (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>) [0 test_1 S4 A32])
>>                 (const_int 0 [0]))
>>             (use (unspec:SI [
>>                         (const_int 0 [0])
>>                     ] UNSPEC_CALLEE_CC))
>>             (clobber (reg:SI 1 ra))
>>         ]) "j.c":14:9 441 {call_internal}
>>      (expr_list:REG_CALL_DECL (symbol_ref:SI ("test_1") [flags 0x41] <function_decl 0x7ffff7796d00 test_1>)
>>         (nil))
>>     (expr_list:SI (use (reg:SI 10 a0))
>>         (nil)))
>>
>> (code_label 13 12 14 4 2 (nil) [1 uses])
>>
>> (note 14 13 24 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
>>
>> (note 24 14 0 NOTE_INSN_DELETED)
>
>Which is incorrect/inconsistent RTL.  And as I've noted before, it's
>conceptually wrong for the backend code to be removing insns from the
>insn chain during prologue/epilogue generation.
>
>I realize you're trying to use a hook to limit how this impacts other
>targets, but if you're making a bad decision in the RISC-V backend code,
>working around it with a target hook rather than fixing the core problem
>in the RISC-V backend just makes the whole situation worse.
>
>My suggest is this.  Leave the assignment to a0 and use alone.  That's
>likely going to result in some kind of code size regression, but not a
>correctness regression.  Then address the code size regressions with the
>invariant that prologue/epilogue generation must not change existing
>insns on the insn chain. 

Hi Jeff

Thanks for your patience.

Got your point never remove insns from the
insn chain during prologue/epilogue generation. 

As you suggested, I will fix this issue by leaving the assignment to a0 and use insn
with cm.popret. Then optimize to cm.popretz in correct pass in future if possilbe.

>jeff
>
>
>
>
>
>
>
>
>
>