diff mbox series

[v3,3/4] ree: Main functionality to Improve ree pass for rs6000 target

Message ID 12889922-0160-a036-7dbf-1d208e353f82@linux.ibm.com
State New
Headers show
Series [v3,1/4] ree: Default ree pass for O2 and above for rs6000 target. | expand

Commit Message

Ajit Agarwal April 19, 2023, 6 p.m. UTC
Hello All:

This is patch-3 to improve ree pass for rs6000 target.
Main functionality routines to imprve ree pass.

Bootstrapped and regtested on powerpc64-gnu-linux.

Thanks & Regards
Ajit

	ree: Improve ree pass for rs6000 target.

	For rs6000 target we see redundant zero and sign
	extension and done to improve ree pass to eliminate
	such redundant zero and sign extension. Support of
	zero_extend/sign_extend/AND.

	2023-04-19  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

	* ree.cc (eliminate_across_bbs_p): Add checks to enable extension
	elimination across and within basic blocks.
	(def_arith_p): New function to check definition has arithmetic
	operation.
	(combine_set_extension): Modification to incorporate AND
	and current zero_extend and sign_extend instruction.
	(merge_def_and_ext): Add calls to eliminate_across_bbs_p and
	zero_extend sign_extend and AND instruction.
	(rtx_is_zext_p): New function.
	(reg_used_set_between_p): New function.

gcc/testsuite/ChangeLog:

	* g++.target/powerpc/zext-elim.C: New testcase.
	* g++.target/powerpc/zext-elim-1.C: New testcase.
	* g++.target/powerpc/zext-elim-2.C: New testcase.
	* g++.target/powerpc/sext-elim.C: New testcase.
---
 gcc/ree.cc                                    | 451 ++++++++++++++++--
 gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
 .../g++.target/powerpc/zext-elim-1.C          |  19 +
 .../g++.target/powerpc/zext-elim-2.C          |  11 +
 gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
 5 files changed, 482 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C

Comments

Jeff Law April 19, 2023, 9:53 p.m. UTC | #1
On 4/19/23 12:00, Ajit Agarwal wrote:
> Hello All:
> 
> This is patch-3 to improve ree pass for rs6000 target.
> Main functionality routines to imprve ree pass.
> 
> Bootstrapped and regtested on powerpc64-gnu-linux.
> 
> Thanks & Regards
> Ajit
> 
> 	ree: Improve ree pass for rs6000 target.
> 
> 	For rs6000 target we see redundant zero and sign
> 	extension and done to improve ree pass to eliminate
> 	such redundant zero and sign extension. Support of
> 	zero_extend/sign_extend/AND.
> 
> 	2023-04-19  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
> 
> gcc/ChangeLog:
> 
> 	* ree.cc (eliminate_across_bbs_p): Add checks to enable extension
> 	elimination across and within basic blocks.
> 	(def_arith_p): New function to check definition has arithmetic
> 	operation.
> 	(combine_set_extension): Modification to incorporate AND
> 	and current zero_extend and sign_extend instruction.
> 	(merge_def_and_ext): Add calls to eliminate_across_bbs_p and
> 	zero_extend sign_extend and AND instruction.
> 	(rtx_is_zext_p): New function.
> 	(reg_used_set_between_p): New function.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* g++.target/powerpc/zext-elim.C: New testcase.
> 	* g++.target/powerpc/zext-elim-1.C: New testcase.
> 	* g++.target/powerpc/zext-elim-2.C: New testcase.
> 	* g++.target/powerpc/sext-elim.C: New testcase.
> ---
>   gcc/ree.cc                                    | 451 ++++++++++++++++--
>   gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
>   .../g++.target/powerpc/zext-elim-1.C          |  19 +
>   .../g++.target/powerpc/zext-elim-2.C          |  11 +
>   gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
>   5 files changed, 482 insertions(+), 47 deletions(-)
>   create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C
> 
> diff --git a/gcc/ree.cc b/gcc/ree.cc
> index 413aec7c8eb..053db2e8ff3 100644
> --- a/gcc/ree.cc
> +++ b/gcc/ree.cc
> @@ -253,6 +253,71 @@ struct ext_cand
>   
>   static int max_insn_uid;
>   
> +bool
> +reg_used_set_between_p (rtx set, rtx_insn *def_insn, rtx_insn *insn
> +{
> +  if (reg_used_between_p (set, def_insn, insn)
> +      || reg_set_between_p (set, def_insn, insn))
> +    return true;
> +
> +  return false;
> +}
This seems general enough that it should go into the same file as 
reg_used_between_p and reg_set_between_p.  It needs a function comment 
as well.


> +static unsigned int
> +rtx_is_zext_p (rtx insn)
> +{
> +  if (GET_CODE (insn) == AND)
> +    {
> +      rtx set = XEXP (insn, 0);
> +      if (REG_P (set))
> +	{
> +	  if (XEXP (insn, 1) == const1_rtx)
> +	    return 1;
> +	}
> +      else
> +	return 0;
> +    }
> +
> +  return 0;
> +}
So my comment from the prior version stands.  Testing for const1_rtx is 
just wrong.  The optimization you're trying to perform (If I understand 
it correctly) works for many other constants and the set of constants 
supported will vary based on the input and output modes.

Similarly in rtx_is_zext_p.

You still have numerous formatting issues which makes reading the patch 
more difficult than it should be.  Please review the formatting 
guidelines and follow them.   In particular please review how to indent 
multi-line conditionals.





You sti
> @@ -698,6 +777,226 @@ get_sub_rtx (rtx_insn *def_insn)
>     return sub_rtx;
>   }
>   
> +/* Check if the def insn is ASHIFT and LSHIFTRT.
> +  Inputs: insn for which def has to be checked.
> +	  source operand rtx.
> +   Output: True or false if def has arithmetic
> +	   peration like ASHIFT and LSHIFTRT.  */
This still needs work.  Between the comments and code, I still don't 
know what you're really trying to do here.  I can make some guesses, but 
it's really your job to write clear comments about what you're doing so 
that a review or someone looking at the code in the future don't have to 
guess.

It looks like you want to look at all the reaching definitions of INSN 
for ORIG_SRC and if they are ASHIFT/LSHIFTRT do...  what?

Why are ASHIFT/LSHIFTRT interesting here?  Why are you looking for them?



> +
> +/* Find feasibility of extension elimination
> +   across basic blocks.
> +   Input: candiate to check the feasibility.
> +	  def_insn of candidate.
> +   Output: Returns true or false if feasible or not.  */
Function comments should read like complete sentences.  Arguments should 
be in all caps.  There are numerous examples in ree.cc you can follow.

There's no comments in this code which describe the properties of the 
CFG/blocks you are looking for (domination, post-domination, whatever). 
It's jsut a series of tests with no clear explanation of what you're 
looking for or why any particular test exists.

As far as I can tell you're looking at the predecessors of cand->insn 
and make some decisions based on what you find.  In some cases you 
return false in others you return true.  But there's zero indication of 
why you do anything.

You're checking RTL in these cases, but I suspect you really want to be 
doing some kind of CFG property checks.  But since you haven't described 
what you're looking for, I can't make suggestions for the right queries 
to make.

I stopped trying to review the function at this point.  It needs major 
work.  Let's start with the block/CFG properties you're looking for. 
We'll then have to go through each section of code and repeat the process.

In fact I might recommend pulling the code which is trying to determine 
CFG properties into its own function.  Then try to come up with a good 
name for that function and a good function comment.

THe next chunk of code you start looking at the properties of the 
candidate insn.  That seems like it ought to break out into its own 
function as well.

THe process of refactoring, choosing good names and function comments 
should make the overall intent of this code clearer.



After that's done we'll review that work and perhaps go further.

jeff
Ajit Agarwal April 20, 2023, 9:03 p.m. UTC | #2
Hello Jeff:

On 20/04/23 3:23 am, Jeff Law wrote:
> 
> 
> On 4/19/23 12:00, Ajit Agarwal wrote:
>> Hello All:
>>
>> This is patch-3 to improve ree pass for rs6000 target.
>> Main functionality routines to imprve ree pass.
>>
>> Bootstrapped and regtested on powerpc64-gnu-linux.
>>
>> Thanks & Regards
>> Ajit
>>
>>     ree: Improve ree pass for rs6000 target.
>>
>>     For rs6000 target we see redundant zero and sign
>>     extension and done to improve ree pass to eliminate
>>     such redundant zero and sign extension. Support of
>>     zero_extend/sign_extend/AND.
>>
>>     2023-04-19  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>
>> gcc/ChangeLog:
>>
>>     * ree.cc (eliminate_across_bbs_p): Add checks to enable extension
>>     elimination across and within basic blocks.
>>     (def_arith_p): New function to check definition has arithmetic
>>     operation.
>>     (combine_set_extension): Modification to incorporate AND
>>     and current zero_extend and sign_extend instruction.
>>     (merge_def_and_ext): Add calls to eliminate_across_bbs_p and
>>     zero_extend sign_extend and AND instruction.
>>     (rtx_is_zext_p): New function.
>>     (reg_used_set_between_p): New function.
>>
>> gcc/testsuite/ChangeLog:
>>
>>     * g++.target/powerpc/zext-elim.C: New testcase.
>>     * g++.target/powerpc/zext-elim-1.C: New testcase.
>>     * g++.target/powerpc/zext-elim-2.C: New testcase.
>>     * g++.target/powerpc/sext-elim.C: New testcase.
>> ---
>>   gcc/ree.cc                                    | 451 ++++++++++++++++--
>>   gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
>>   .../g++.target/powerpc/zext-elim-1.C          |  19 +
>>   .../g++.target/powerpc/zext-elim-2.C          |  11 +
>>   gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
>>   5 files changed, 482 insertions(+), 47 deletions(-)
>>   create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C
>>
>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>> index 413aec7c8eb..053db2e8ff3 100644
>> --- a/gcc/ree.cc
>> +++ b/gcc/ree.cc
>> @@ -253,6 +253,71 @@ struct ext_cand
>>     static int max_insn_uid;
>>   +bool
>> +reg_used_set_between_p (rtx set, rtx_insn *def_insn, rtx_insn *insn
>> +{
>> +  if (reg_used_between_p (set, def_insn, insn)
>> +      || reg_set_between_p (set, def_insn, insn))
>> +    return true;
>> +
>> +  return false;
>> +}
> This seems general enough that it should go into the same file as reg_used_between_p and reg_set_between_p.  It needs a function comment as well.
> 
> 
>> +static unsigned int
>> +rtx_is_zext_p (rtx insn)
>> +{
>> +  if (GET_CODE (insn) == AND)
>> +    {
>> +      rtx set = XEXP (insn, 0);
>> +      if (REG_P (set))
>> +    {
>> +      if (XEXP (insn, 1) == const1_rtx)
>> +        return 1;
>> +    }
>> +      else
>> +    return 0;
>> +    }
>> +
>> +  return 0;
>> +}
> So my comment from the prior version stands.  Testing for const1_rtx is just wrong.  The optimization you're trying to perform (If I understand it correctly) works for many other constants and the set of constants supported will vary based on the input and output modes.
> 
> Similarly in rtx_is_zext_p.
> 
> You still have numerous formatting issues which makes reading the patch more difficult than it should be.  Please review the formatting guidelines and follow them.   In particular please review how to indent multi-line conditionals.
> 
> 

Currently I support AND with const1_rtx. This is what is equivalent to zero extension instruction in power instruction set. When you specify many other constants and Could you please specify what other constants needs to be supported and how to determine on the Input and output modes.
> 
> 
> 
> You sti
>> @@ -698,6 +777,226 @@ get_sub_rtx (rtx_insn *def_insn)
>>     return sub_rtx;
>>   }
>>   +/* Check if the def insn is ASHIFT and LSHIFTRT.
>> +  Inputs: insn for which def has to be checked.
>> +      source operand rtx.
>> +   Output: True or false if def has arithmetic
>> +       peration like ASHIFT and LSHIFTRT.  */
> This still needs work.  Between the comments and code, I still don't know what you're really trying to do here.  I can make some guesses, but it's really your job to write clear comments about what you're doing so that a review or someone looking at the code in the future don't have to guess.
> 
> It looks like you want to look at all the reaching definitions of INSN for ORIG_SRC and if they are ASHIFT/LSHIFTRT do...  what?
> 
> Why are ASHIFT/LSHIFTRT interesting here?  Why are you looking for them?
> 
> 
> 
>> +
>> +/* Find feasibility of extension elimination
>> +   across basic blocks.
>> +   Input: candiate to check the feasibility.
>> +      def_insn of candidate.
>> +   Output: Returns true or false if feasible or not.  */
> Function comments should read like complete sentences.  Arguments should be in all caps.  There are numerous examples in ree.cc you can follow.
> 
> There's no comments in this code which describe the properties of the CFG/blocks you are looking for (domination, post-domination, whatever). It's jsut a series of tests with no clear explanation of what you're looking for or why any particular test exists.
> 
> As far as I can tell you're looking at the predecessors of cand->insn and make some decisions based on what you find.  In some cases you return false in others you return true.  But there's zero indication of why you do anything.
> 
> You're checking RTL in these cases, but I suspect you really want to be doing some kind of CFG property checks.  But since you haven't described what you're looking for, I can't make suggestions for the right queries to make.
> 
> I stopped trying to review the function at this point.  It needs major work.  Let's start with the block/CFG properties you're looking for. We'll then have to go through each section of code and repeat the process.
> 
> In fact I might recommend pulling the code which is trying to determine CFG properties into its own function.  Then try to come up with a good name for that function and a good function comment.
> 
> THe next chunk of code you start looking at the properties of the candidate insn.  That seems like it ought to break out into its own function as well.
> 
> THe process of refactoring, choosing good names and function comments should make the overall intent of this code clearer.
> 
> 
>

I am working on this item and will incorporate them.

Thanks & Regards
Ajit
 
> After that's done we'll review that work and perhaps go further.
> 
> jeff
Ajit Agarwal April 20, 2023, 9:18 p.m. UTC | #3
Hello Jeff:


On 21/04/23 2:33 am, Ajit Agarwal via Gcc-patches wrote:
> Hello Jeff:
> 
> On 20/04/23 3:23 am, Jeff Law wrote:
>>
>>
>> On 4/19/23 12:00, Ajit Agarwal wrote:
>>> Hello All:
>>>
>>> This is patch-3 to improve ree pass for rs6000 target.
>>> Main functionality routines to imprve ree pass.
>>>
>>> Bootstrapped and regtested on powerpc64-gnu-linux.
>>>
>>> Thanks & Regards
>>> Ajit
>>>
>>>     ree: Improve ree pass for rs6000 target.
>>>
>>>     For rs6000 target we see redundant zero and sign
>>>     extension and done to improve ree pass to eliminate
>>>     such redundant zero and sign extension. Support of
>>>     zero_extend/sign_extend/AND.
>>>
>>>     2023-04-19  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>>
>>> gcc/ChangeLog:
>>>
>>>     * ree.cc (eliminate_across_bbs_p): Add checks to enable extension
>>>     elimination across and within basic blocks.
>>>     (def_arith_p): New function to check definition has arithmetic
>>>     operation.
>>>     (combine_set_extension): Modification to incorporate AND
>>>     and current zero_extend and sign_extend instruction.
>>>     (merge_def_and_ext): Add calls to eliminate_across_bbs_p and
>>>     zero_extend sign_extend and AND instruction.
>>>     (rtx_is_zext_p): New function.
>>>     (reg_used_set_between_p): New function.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * g++.target/powerpc/zext-elim.C: New testcase.
>>>     * g++.target/powerpc/zext-elim-1.C: New testcase.
>>>     * g++.target/powerpc/zext-elim-2.C: New testcase.
>>>     * g++.target/powerpc/sext-elim.C: New testcase.
>>> ---
>>>   gcc/ree.cc                                    | 451 ++++++++++++++++--
>>>   gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
>>>   .../g++.target/powerpc/zext-elim-1.C          |  19 +
>>>   .../g++.target/powerpc/zext-elim-2.C          |  11 +
>>>   gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
>>>   5 files changed, 482 insertions(+), 47 deletions(-)
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C
>>>
>>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>>> index 413aec7c8eb..053db2e8ff3 100644
>>> --- a/gcc/ree.cc
>>> +++ b/gcc/ree.cc
>>> @@ -253,6 +253,71 @@ struct ext_cand
>>>     static int max_insn_uid;
>>>   +bool
>>> +reg_used_set_between_p (rtx set, rtx_insn *def_insn, rtx_insn *insn
>>> +{
>>> +  if (reg_used_between_p (set, def_insn, insn)
>>> +      || reg_set_between_p (set, def_insn, insn))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>> This seems general enough that it should go into the same file as reg_used_between_p and reg_set_between_p.  It needs a function comment as well.
>>
>>
>>> +static unsigned int
>>> +rtx_is_zext_p (rtx insn)
>>> +{
>>> +  if (GET_CODE (insn) == AND)
>>> +    {
>>> +      rtx set = XEXP (insn, 0);
>>> +      if (REG_P (set))
>>> +    {
>>> +      if (XEXP (insn, 1) == const1_rtx)
>>> +        return 1;
>>> +    }
>>> +      else
>>> +    return 0;
>>> +    }
>>> +
>>> +  return 0;
>>> +}
>> So my comment from the prior version stands.  Testing for const1_rtx is just wrong.  The optimization you're trying to perform (If I understand it correctly) works for many other constants and the set of constants supported will vary based on the input and output modes.
>>
>> Similarly in rtx_is_zext_p.
>>
>> You still have numerous formatting issues which makes reading the patch more difficult than it should be.  Please review the formatting guidelines and follow them.   In particular please review how to indent multi-line conditionals.
>>
>>
> 
> Currently I support AND with const1_rtx. This is what is equivalent to zero extension instruction in power instruction set. When you specify many other constants and Could you please specify what other constants needs to be supported and how to determine on the Input and output modes.
>>
On top of that I support eliminating zero_extend and sign_extend wherein if result mode of def insn not equal to source operand of zero_extend and sign_extend.

Thanks & Regards
Ajit
>>
>>
>> You sti
>>> @@ -698,6 +777,226 @@ get_sub_rtx (rtx_insn *def_insn)
>>>     return sub_rtx;
>>>   }
>>>   +/* Check if the def insn is ASHIFT and LSHIFTRT.
>>> +  Inputs: insn for which def has to be checked.
>>> +      source operand rtx.
>>> +   Output: True or false if def has arithmetic
>>> +       peration like ASHIFT and LSHIFTRT.  */
>> This still needs work.  Between the comments and code, I still don't know what you're really trying to do here.  I can make some guesses, but it's really your job to write clear comments about what you're doing so that a review or someone looking at the code in the future don't have to guess.
>>
>> It looks like you want to look at all the reaching definitions of INSN for ORIG_SRC and if they are ASHIFT/LSHIFTRT do...  what?
>>
>> Why are ASHIFT/LSHIFTRT interesting here?  Why are you looking for them?
>>
>>
>>
>>> +
>>> +/* Find feasibility of extension elimination
>>> +   across basic blocks.
>>> +   Input: candiate to check the feasibility.
>>> +      def_insn of candidate.
>>> +   Output: Returns true or false if feasible or not.  */
>> Function comments should read like complete sentences.  Arguments should be in all caps.  There are numerous examples in ree.cc you can follow.
>>
>> There's no comments in this code which describe the properties of the CFG/blocks you are looking for (domination, post-domination, whatever). It's jsut a series of tests with no clear explanation of what you're looking for or why any particular test exists.
>>
>> As far as I can tell you're looking at the predecessors of cand->insn and make some decisions based on what you find.  In some cases you return false in others you return true.  But there's zero indication of why you do anything.
>>
>> You're checking RTL in these cases, but I suspect you really want to be doing some kind of CFG property checks.  But since you haven't described what you're looking for, I can't make suggestions for the right queries to make.
>>
>> I stopped trying to review the function at this point.  It needs major work.  Let's start with the block/CFG properties you're looking for. We'll then have to go through each section of code and repeat the process.
>>
>> In fact I might recommend pulling the code which is trying to determine CFG properties into its own function.  Then try to come up with a good name for that function and a good function comment.
>>
>> THe next chunk of code you start looking at the properties of the candidate insn.  That seems like it ought to break out into its own function as well.
>>
>> THe process of refactoring, choosing good names and function comments should make the overall intent of this code clearer.
>>
>>
>>
> 
> I am working on this item and will incorporate them.
> 
> Thanks & Regards
> Ajit
>  
>> After that's done we'll review that work and perhaps go further.
>>
>> jeff
Ajit Agarwal April 20, 2023, 9:21 p.m. UTC | #4
Hello Jeff:

On 21/04/23 2:33 am, Ajit Agarwal wrote:
> Hello Jeff:
> 
> On 20/04/23 3:23 am, Jeff Law wrote:
>>
>>
>> On 4/19/23 12:00, Ajit Agarwal wrote:
>>> Hello All:
>>>
>>> This is patch-3 to improve ree pass for rs6000 target.
>>> Main functionality routines to imprve ree pass.
>>>
>>> Bootstrapped and regtested on powerpc64-gnu-linux.
>>>
>>> Thanks & Regards
>>> Ajit
>>>
>>>     ree: Improve ree pass for rs6000 target.
>>>
>>>     For rs6000 target we see redundant zero and sign
>>>     extension and done to improve ree pass to eliminate
>>>     such redundant zero and sign extension. Support of
>>>     zero_extend/sign_extend/AND.
>>>
>>>     2023-04-19  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
>>>
>>> gcc/ChangeLog:
>>>
>>>     * ree.cc (eliminate_across_bbs_p): Add checks to enable extension
>>>     elimination across and within basic blocks.
>>>     (def_arith_p): New function to check definition has arithmetic
>>>     operation.
>>>     (combine_set_extension): Modification to incorporate AND
>>>     and current zero_extend and sign_extend instruction.
>>>     (merge_def_and_ext): Add calls to eliminate_across_bbs_p and
>>>     zero_extend sign_extend and AND instruction.
>>>     (rtx_is_zext_p): New function.
>>>     (reg_used_set_between_p): New function.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * g++.target/powerpc/zext-elim.C: New testcase.
>>>     * g++.target/powerpc/zext-elim-1.C: New testcase.
>>>     * g++.target/powerpc/zext-elim-2.C: New testcase.
>>>     * g++.target/powerpc/sext-elim.C: New testcase.
>>> ---
>>>   gcc/ree.cc                                    | 451 ++++++++++++++++--
>>>   gcc/testsuite/g++.target/powerpc/sext-elim.C  |  18 +
>>>   .../g++.target/powerpc/zext-elim-1.C          |  19 +
>>>   .../g++.target/powerpc/zext-elim-2.C          |  11 +
>>>   gcc/testsuite/g++.target/powerpc/zext-elim.C  |  30 ++
>>>   5 files changed, 482 insertions(+), 47 deletions(-)
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/sext-elim.C
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-1.C
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim-2.C
>>>   create mode 100644 gcc/testsuite/g++.target/powerpc/zext-elim.C
>>>
>>> diff --git a/gcc/ree.cc b/gcc/ree.cc
>>> index 413aec7c8eb..053db2e8ff3 100644
>>> --- a/gcc/ree.cc
>>> +++ b/gcc/ree.cc
>>> @@ -253,6 +253,71 @@ struct ext_cand
>>>     static int max_insn_uid;
>>>   +bool
>>> +reg_used_set_between_p (rtx set, rtx_insn *def_insn, rtx_insn *insn
>>> +{
>>> +  if (reg_used_between_p (set, def_insn, insn)
>>> +      || reg_set_between_p (set, def_insn, insn))
>>> +    return true;
>>> +
>>> +  return false;
>>> +}
>> This seems general enough that it should go into the same file as reg_used_between_p and reg_set_between_p.  It needs a function comment as well.
>>
>>
>>> +static unsigned int
>>> +rtx_is_zext_p (rtx insn)
>>> +{
>>> +  if (GET_CODE (insn) == AND)
>>> +    {
>>> +      rtx set = XEXP (insn, 0);
>>> +      if (REG_P (set))
>>> +    {
>>> +      if (XEXP (insn, 1) == const1_rtx)
>>> +        return 1;
>>> +    }
>>> +      else
>>> +    return 0;
>>> +    }
>>> +
>>> +  return 0;
>>> +}
>> So my comment from the prior version stands.  Testing for const1_rtx is just wrong.  The optimization you're trying to perform (If I understand it correctly) works for many other constants and the set of constants supported will vary based on the input and output modes.
>>
>> Similarly in rtx_is_zext_p.
>>
>> You still have numerous formatting issues which makes reading the patch more difficult than it should be.  Please review the formatting guidelines and follow them.   In particular please review how to indent multi-line conditionals.
>>
>>
> 
> Currently I support AND with const1_rtx. This is what is equivalent to zero extension instruction in power instruction set. When you specify many other constants and Could you please specify what other constants needs to be supported and how to determine on the Input and output modes.

On top of that I support eliminating zero_extend and sign_extend wherein if result mode of def insn not equal to source operand of zero_extend and sign_extend.

Thanks & Regards
Ajit
>>
>>
>>
>> You sti
>>> @@ -698,6 +777,226 @@ get_sub_rtx (rtx_insn *def_insn)
>>>     return sub_rtx;
>>>   }
>>>   +/* Check if the def insn is ASHIFT and LSHIFTRT.
>>> +  Inputs: insn for which def has to be checked.
>>> +      source operand rtx.
>>> +   Output: True or false if def has arithmetic
>>> +       peration like ASHIFT and LSHIFTRT.  */
>> This still needs work.  Between the comments and code, I still don't know what you're really trying to do here.  I can make some guesses, but it's really your job to write clear comments about what you're doing so that a review or someone looking at the code in the future don't have to guess.
>>
>> It looks like you want to look at all the reaching definitions of INSN for ORIG_SRC and if they are ASHIFT/LSHIFTRT do...  what?
>>
>> Why are ASHIFT/LSHIFTRT interesting here?  Why are you looking for them?
>>
>>
>>
>>> +
>>> +/* Find feasibility of extension elimination
>>> +   across basic blocks.
>>> +   Input: candiate to check the feasibility.
>>> +      def_insn of candidate.
>>> +   Output: Returns true or false if feasible or not.  */
>> Function comments should read like complete sentences.  Arguments should be in all caps.  There are numerous examples in ree.cc you can follow.
>>
>> There's no comments in this code which describe the properties of the CFG/blocks you are looking for (domination, post-domination, whatever). It's jsut a series of tests with no clear explanation of what you're looking for or why any particular test exists.
>>
>> As far as I can tell you're looking at the predecessors of cand->insn and make some decisions based on what you find.  In some cases you return false in others you return true.  But there's zero indication of why you do anything.
>>
>> You're checking RTL in these cases, but I suspect you really want to be doing some kind of CFG property checks.  But since you haven't described what you're looking for, I can't make suggestions for the right queries to make.
>>
>> I stopped trying to review the function at this point.  It needs major work.  Let's start with the block/CFG properties you're looking for. We'll then have to go through each section of code and repeat the process.
>>
>> In fact I might recommend pulling the code which is trying to determine CFG properties into its own function.  Then try to come up with a good name for that function and a good function comment.
>>
>> THe next chunk of code you start looking at the properties of the candidate insn.  That seems like it ought to break out into its own function as well.
>>
>> THe process of refactoring, choosing good names and function comments should make the overall intent of this code clearer.
>>
>>
>>
> 
> I am working on this item and will incorporate them.
> 
> Thanks & Regards
> Ajit
>  
>> After that's done we'll review that work and perhaps go further.
>>
>> jeff
Jeff Law April 28, 2023, 10:10 p.m. UTC | #5
On 4/20/23 15:03, Ajit Agarwal wrote:

> 
> Currently I support AND with const1_rtx. This is what is equivalent to zero extension instruction in power instruction set. When you specify many other constants and Could you please specify what other constants needs to be supported and how to determine on the Input and output modes.
x AND <constant> will result in a zero-extended representation for a 
variety of constants, not just 1.  For example

For example x AND 3, x AND 7, x AND 15, etc.

If (const_int 1) is really that special here, then I've either 
completely misunderstood the intention of your patch or there's 
something quite special about the PPC port that I'm not aware of.

Jeff
Ajit Agarwal May 12, 2023, 11:18 a.m. UTC | #6
Hello Jeff:


On 29/04/23 3:40 am, Jeff Law wrote:
> 
> 
> On 4/20/23 15:03, Ajit Agarwal wrote:
> 
>>
>> Currently I support AND with const1_rtx. This is what is equivalent to zero extension instruction in power instruction set. When you specify many other constants and Could you please specify what other constants needs to be supported and how to determine on the Input and output modes.
> x AND <constant> will result in a zero-extended representation for a variety of constants, not just 1.  For example
> 
> For example x AND 3, x AND 7, x AND 15, etc.
> 
> If (const_int 1) is really that special here, then I've either completely misunderstood the intention of your patch or there's something quite special about the PPC port that I'm not aware of.
>

Here is the patch to address above.

	ree: Improve ree pass for rs6000 target

	For rs6000 target we see redundant zero and sign
	extension and done to improve ree pass to eliminate
	such redundant zero and sign extension. Support of
	AND with extension with different constants other
	than 1.

	2023-05-12  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>

gcc/ChangeLog:

	* ree.cc (rtx_is_zext_p): Add AND with varying contsants as
	extensions.
---
 gcc/ree.cc | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/ree.cc b/gcc/ree.cc
index 96fda1ac658..ddda5f194bb 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -269,8 +269,11 @@ rtx_is_zext_p (rtx insn)
       rtx set = XEXP (insn, 0);
       if (REG_P (set))
 	{
-	   if (XEXP (insn, 1) == const1_rtx)
-	     return true;
+	  rtx src = XEXP (insn, 1);
+
+	  if (CONST_INT_P (src)
+	      && IN_RANGE (exact_log2 (UINTVAL (src)), 0, 7))
+	    return true;
 	}
       else
 	return false;
@@ -297,9 +300,11 @@ rtx_is_zext_p (rtx_insn *insn)
 
      if (REG_P (set) && GET_MODE (SET_DEST (body)) == GET_MODE (set))
        {
-	 if (GET_MODE_UNIT_SIZE (GET_MODE (SET_DEST (body)))
-	     >= GET_MODE_UNIT_SIZE (GET_MODE (set)))
-	   return true;
+	  rtx src = XEXP (SET_SRC (body), 1);
+
+	  if (CONST_INT_P (src)
+	      && IN_RANGE (exact_log2 (UINTVAL (src)), 0, 7))
+	    return true;
        }
      else
       return false;
Ajit Agarwal May 31, 2023, 5:32 a.m. UTC | #7
Hello Jeff:

Please review Jeff.

Thanks & Regards
Ajit

On 12/05/23 4:48 pm, Ajit Agarwal via Gcc-patches wrote:
> Hello Jeff:
> 
> 
> On 29/04/23 3:40 am, Jeff Law wrote:
>>
>>
>> On 4/20/23 15:03, Ajit Agarwal wrote:
>>
>>>
>>> Currently I support AND with const1_rtx. This is what is equivalent to zero extension instruction in power instruction set. When you specify many other constants and Could you please specify what other constants needs to be supported and how to determine on the Input and output modes.
>> x AND <constant> will result in a zero-extended representation for a variety of constants, not just 1.  For example
>>
>> For example x AND 3, x AND 7, x AND 15, etc.
>>
>> If (const_int 1) is really that special here, then I've either completely misunderstood the intention of your patch or there's something quite special about the PPC port that I'm not aware of.
>>
> 
> Here is the patch to address above.
> 
> 	ree: Improve ree pass for rs6000 target
> 
> 	For rs6000 target we see redundant zero and sign
> 	extension and done to improve ree pass to eliminate
> 	such redundant zero and sign extension. Support of
> 	AND with extension with different constants other
> 	than 1.
> 
> 	2023-05-12  Ajit Kumar Agarwal  <aagarwa1@linux.ibm.com>
> 
> gcc/ChangeLog:
> 
> 	* ree.cc (rtx_is_zext_p): Add AND with varying contsants as
> 	extensions.
> ---
>  gcc/ree.cc | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/ree.cc b/gcc/ree.cc
> index 96fda1ac658..ddda5f194bb 100644
> --- a/gcc/ree.cc
> +++ b/gcc/ree.cc
> @@ -269,8 +269,11 @@ rtx_is_zext_p (rtx insn)
>        rtx set = XEXP (insn, 0);
>        if (REG_P (set))
>  	{
> -	   if (XEXP (insn, 1) == const1_rtx)
> -	     return true;
> +	  rtx src = XEXP (insn, 1);
> +
> +	  if (CONST_INT_P (src)
> +	      && IN_RANGE (exact_log2 (UINTVAL (src)), 0, 7))
> +	    return true;
>  	}
>        else
>  	return false;
> @@ -297,9 +300,11 @@ rtx_is_zext_p (rtx_insn *insn)
>  
>       if (REG_P (set) && GET_MODE (SET_DEST (body)) == GET_MODE (set))
>         {
> -	 if (GET_MODE_UNIT_SIZE (GET_MODE (SET_DEST (body)))
> -	     >= GET_MODE_UNIT_SIZE (GET_MODE (set)))
> -	   return true;
> +	  rtx src = XEXP (SET_SRC (body), 1);
> +
> +	  if (CONST_INT_P (src)
> +	      && IN_RANGE (exact_log2 (UINTVAL (src)), 0, 7))
> +	    return true;
>         }
>       else
>        return false;
diff mbox series

Patch

diff --git a/gcc/ree.cc b/gcc/ree.cc
index 413aec7c8eb..053db2e8ff3 100644
--- a/gcc/ree.cc
+++ b/gcc/ree.cc
@@ -253,6 +253,71 @@  struct ext_cand
 
 static int max_insn_uid;
 
+bool
+reg_used_set_between_p (rtx set, rtx_insn *def_insn, rtx_insn *insn)
+{
+  if (reg_used_between_p (set, def_insn, insn)
+      || reg_set_between_p (set, def_insn, insn))
+    return true;
+
+  return false;
+}
+
+/* Return TRUE if OP can be considered a zero extension from one or
+   more sub-word modes to larger modes up to a full word.
+
+   For example (and:DI (reg) (const_int X))
+
+   Depending on the value of X could be considered a zero extension
+   from QI, HI and SI to larger modes up to DImode.  */
+
+static unsigned int
+rtx_is_zext_p (rtx insn)
+{
+  if (GET_CODE (insn) == AND)
+    {
+      rtx set = XEXP (insn, 0);
+      if (REG_P (set))
+	{
+	  if (XEXP (insn, 1) == const1_rtx)
+	    return 1;
+	}
+      else
+	return 0;
+    }
+
+  return 0;
+}
+
+/* Return TRUE if OP can be considered a zero extension from one or
+   more sub-word modes to larger modes up to a full word.
+
+   For example (and:DI (reg) (const_int X))
+
+   Depending on the value of X could be considered a zero extension
+   from QI, HI and SI to larger modes up to DImode.  */
+
+static unsigned int
+rtx_is_zext_p (rtx_insn *insn)
+{
+  rtx body = single_set (insn);
+
+  if (GET_CODE (body) == SET && GET_CODE (SET_SRC (body)) == AND)
+   {
+     rtx set = XEXP (SET_SRC (body), 0);
+
+     if (REG_P (set) && GET_MODE (SET_DEST (body)) == GET_MODE (set))
+       {
+	 if (XEXP (SET_SRC (body), 1) == const1_rtx)
+	   return 1;
+       }
+     else
+      return 0;
+   }
+
+   return 0;
+}
+
 /* Update or remove REG_EQUAL or REG_EQUIV notes for INSN.  */
 
 static bool
@@ -319,7 +384,7 @@  combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
 {
   rtx orig_src = SET_SRC (*orig_set);
   machine_mode orig_mode = GET_MODE (SET_DEST (*orig_set));
-  rtx new_set;
+  rtx new_set = NULL_RTX;
   rtx cand_pat = single_set (cand->insn);
 
   /* If the extension's source/destination registers are not the same
@@ -359,27 +424,41 @@  combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
   else if (GET_CODE (orig_src) == cand->code)
     {
       /* Here is a sequence of two extensions.  Try to merge them.  */
-      rtx temp_extension
-	= gen_rtx_fmt_e (cand->code, cand->mode, XEXP (orig_src, 0));
+      rtx temp_extension = NULL_RTX;
+      if (GET_CODE (SET_SRC (cand_pat)) == AND)
+	temp_extension
+	= gen_rtx_AND (cand->mode, XEXP (orig_src, 0), XEXP (orig_src, 1));
+      else
+	temp_extension
+	 = gen_rtx_fmt_e (cand->code, cand->mode, XEXP (orig_src, 0));
       rtx simplified_temp_extension = simplify_rtx (temp_extension);
       if (simplified_temp_extension)
         temp_extension = simplified_temp_extension;
+
       new_set = gen_rtx_SET (new_reg, temp_extension);
     }
   else if (GET_CODE (orig_src) == IF_THEN_ELSE)
     {
       /* Only IF_THEN_ELSE of phi-type copies are combined.  Otherwise,
-         in general, IF_THEN_ELSE should not be combined.  */
-      return false;
+	in general, IF_THEN_ELSE should not be combined.  Relaxed
+	cases with IF_THEN_ELSE across basic blocls */
+	return true;
     }
   else
     {
       /* This is the normal case.  */
-      rtx temp_extension
+      rtx temp_extension = NULL_RTX;
+
+      if (GET_CODE (SET_SRC (cand_pat)) == AND)
+	temp_extension
+	= gen_rtx_AND (cand->mode, orig_src, XEXP (SET_SRC (cand_pat), 1));
+      else
+	temp_extension
 	= gen_rtx_fmt_e (cand->code, cand->mode, orig_src);
       rtx simplified_temp_extension = simplify_rtx (temp_extension);
       if (simplified_temp_extension)
         temp_extension = simplified_temp_extension;
+
       new_set = gen_rtx_SET (new_reg, temp_extension);
     }
 
@@ -468,12 +547,13 @@  get_defs (rtx_insn *insn, rtx reg, vec<rtx_insn *> *dest)
   FOR_EACH_INSN_USE (use, insn)
     {
       if (GET_CODE (DF_REF_REG (use)) == SUBREG)
-        return NULL;
+	return NULL;
       if (REGNO (DF_REF_REG (use)) == REGNO (reg))
 	break;
     }
 
-  gcc_assert (use != NULL);
+  if (use == NULL)
+    return NULL;
 
   ref_chain = DF_REF_CHAIN (use);
 
@@ -481,9 +561,9 @@  get_defs (rtx_insn *insn, rtx reg, vec<rtx_insn *> *dest)
     {
       /* Problem getting some definition for this instruction.  */
       if (ref_link->ref == NULL)
-        return NULL;
+	return NULL;
       if (DF_REF_INSN_INFO (ref_link->ref) == NULL)
-        return NULL;
+	return NULL;
       /* As global regs are assumed to be defined at each function call
 	 dataflow can report a call_insn as being a definition of REG.
 	 But we can't do anything with that in this pass so proceed only
@@ -500,7 +580,6 @@  get_defs (rtx_insn *insn, rtx reg, vec<rtx_insn *> *dest)
 
   return ref_chain;
 }
-
 /* Get all the reaching uses of an instruction.  The uses are desired for REG
    set in INSN.  Return use list or NULL if a use is missing or irregular.  */
 
@@ -698,6 +777,226 @@  get_sub_rtx (rtx_insn *def_insn)
   return sub_rtx;
 }
 
+/* Check if the def insn is ASHIFT and LSHIFTRT.
+  Inputs: insn for which def has to be checked.
+	  source operand rtx.
+   Output: True or false if def has arithmetic
+	   peration like ASHIFT and LSHIFTRT.  */
+
+static bool
+def_arith_p (rtx_insn *insn, rtx orig_src)
+{
+  if (!REG_P (orig_src))
+     return true;
+
+  vec<rtx_insn *> *dest = XCNEWVEC (vec<rtx_insn *>, 4);
+  if (!get_defs (insn, orig_src, dest))
+    return false;
+
+  int i;
+  rtx_insn *def_insn;
+  bool has_arith = false;
+
+  FOR_EACH_VEC_ELT (*dest, i, def_insn)
+    {
+      rtx def_set = single_set (def_insn);
+
+      if (!def_set)
+	{
+	  has_arith = true;
+	  break;
+	}
+
+      if (DEBUG_INSN_P (def_insn))
+	continue;
+
+      /* checks set (reg x), (ashift ( ...)
+		set (reg x), (lshiftrt (....)   */
+
+      if ((GET_CODE (PATTERN (def_insn)) == SET
+	   && (GET_CODE (SET_SRC (def_set)) == ASHIFT
+	   || GET_CODE (SET_SRC (def_set)) == LSHIFTRT)))
+	{
+	  has_arith = true;
+	  break;
+	}
+
+      /* checks set (reg x) , (plus(ashift ( ....)
+		set (reg x), (plus(lshiftrt (....)  */
+
+      if (GET_CODE (PATTERN (def_insn)) == SET
+	  && (GET_RTX_CLASS (GET_CODE (SET_SRC (def_set))) == RTX_BIN_ARITH
+	  || GET_RTX_CLASS (GET_CODE (SET_SRC (def_set))) == RTX_COMM_ARITH))
+	{
+	  rtx src = XEXP (SET_SRC (def_set),0);
+
+	  if (GET_CODE (src) == LSHIFTRT
+	      || GET_CODE (src) == ASHIFT)
+	    {
+	      has_arith = true;
+	      break;
+	    }
+	}
+     }
+  XDELETEVEC (dest);
+  return has_arith;
+}
+
+/* Find feasibility of extension elimination
+   across basic blocks.
+   Input: candiate to check the feasibility.
+	  def_insn of candidate.
+   Output: Returns true or false if feasible or not.  */
+
+static bool
+eliminate_across_bbs_p (ext_cand *cand, rtx_insn *def_insn)
+{
+  basic_block bb = BLOCK_FOR_INSN (cand->insn);
+  edge fallthru_edge;
+  edge e;
+  edge_iterator ei;
+
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    {
+      rtx_insn *insn = BB_END (e->src) ? PREV_INSN (BB_END (e->src)) : NULL;
+
+      if (insn == NULL)
+	continue;
+
+      if (DEBUG_INSN_P (insn))
+	continue;
+
+      rtx set = single_set (insn);
+
+      if (insn && set
+	  && GET_CODE (set) == SET && SET_SRC (set)
+	  && GET_CODE (SET_SRC (set)) == IF_THEN_ELSE)
+	{
+	  if (e->dest == bb)
+	    {
+	      basic_block jump_block = e->dest;
+	      if (jump_block == bb)
+		{
+		  if (single_succ_p (BLOCK_FOR_INSN (def_insn)))
+		    {
+		      fallthru_edge = single_succ_edge (BLOCK_FOR_INSN (def_insn));
+		      if (BB_END (fallthru_edge->dest) && (bb != fallthru_edge->dest
+			|| e->dest != fallthru_edge->dest))
+			return false;
+		    }
+		   else
+		     return false;
+		}
+		else return false;
+	     }
+	     else
+	       {
+		 if (single_succ_p (e->dest))
+		   {
+		     fallthru_edge = single_succ_edge (e->dest);
+		     if (BB_END (fallthru_edge->dest) && bb != fallthru_edge->dest)
+		       return false;
+		   }
+		}
+	  }
+    }
+  if (single_succ_p (BLOCK_FOR_INSN (def_insn)))
+    {
+      fallthru_edge = single_succ_edge (BLOCK_FOR_INSN (def_insn));
+      if (BB_END (fallthru_edge->dest) && bb != fallthru_edge->dest)
+	return false;
+    }
+   else
+     return false;
+
+   rtx cand_set = single_set(cand->insn);
+   /* The destination register of the extension insn must not be
+	 used or set between the def_insn and cand->insn exclusive.  */
+   if (INSN_CHAIN_CODE_P (GET_CODE (def_insn))
+       && INSN_CHAIN_CODE_P (cand->code))
+     if ((cand->code == ZERO_EXTEND)
+	  && REG_P (SET_DEST (cand_set)) && NEXT_INSN (def_insn)
+	  && reg_used_set_between_p(SET_DEST (cand_set), def_insn, cand->insn))
+	return false;
+
+   if (cand->code == ZERO_EXTEND
+       && (bb != BLOCK_FOR_INSN (def_insn)
+       || DF_INSN_LUID (def_insn) > DF_INSN_LUID (cand->insn)))
+     return false;
+
+   if (rtx_is_zext_p (cand->insn))
+     {
+       if (GET_CODE (PATTERN (BB_END (bb))) != USE)
+	return false;
+
+       if (REGNO (XEXP (PATTERN (BB_END (bb)), 0)) != REGNO (SET_DEST (cand->expr)))
+	return false;
+     }
+
+   rtx set = single_set (def_insn);
+
+   if (!set)
+     return false;
+
+   if (cand->code == SIGN_EXTEND
+       && GET_CODE (set) == SET)
+     {
+	rtx orig_src = SET_SRC (set);
+	machine_mode ext_src_mode;
+
+	ext_src_mode = GET_MODE (XEXP (SET_SRC (cand->expr), 0));
+
+	if (GET_MODE (SET_DEST (set)) != ext_src_mode)
+	  return false;
+	if (GET_CODE (orig_src) != PLUS)
+	  return false;
+
+	if (!REG_P (XEXP (orig_src, 0)))
+	  return false;
+
+	if (!REG_P (XEXP (orig_src,1)))
+	  return false;
+
+	if (GET_CODE (orig_src) == PLUS)
+	  {
+	    bool def_src1
+	      = def_arith_p (def_insn,
+			     XEXP (SET_SRC (set), 0));
+	    bool def_src2
+	       = def_arith_p (def_insn,
+			     XEXP (SET_SRC (set), 1));
+
+	    if (def_src1 || def_src2)
+	      return false;
+	}
+     }
+
+   if (cand->code == ZERO_EXTEND
+	&& GET_CODE (set) == SET)
+     {
+	if (GET_CODE (SET_SRC (set)) != XOR
+	    && GET_CODE (SET_SRC (set)) != IOR)
+	  return false;
+
+
+	if (GET_CODE (SET_SRC (set)) == XOR
+	     || GET_CODE (SET_SRC (set)) == IOR)
+	  {
+	    bool def_src1
+	      = def_arith_p (def_insn,
+			     XEXP (SET_SRC (set), 0));
+	    bool def_src2
+	       = def_arith_p (def_insn,
+			     XEXP (SET_SRC (set), 1));
+
+	    if (def_src1 || def_src2)
+	      return false;
+	  }
+      }
+
+   return true;
+}
+
 /* Merge the DEF_INSN with an extension.  Calls combine_set_extension
    on the SET pattern.  */
 
@@ -713,12 +1012,32 @@  merge_def_and_ext (ext_cand *cand, rtx_insn *def_insn, ext_state *state)
   if (sub_rtx == NULL)
     return false;
 
-  if (GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode
-	  || ((state->modified[INSN_UID (def_insn)].kind
-	       == (cand->code == ZERO_EXTEND
+  bool copy_needed
+    = (REGNO (SET_DEST (cand->expr)) != REGNO (XEXP (SET_SRC (cand->expr), 0)));
+
+  bool feasible = eliminate_across_bbs_p (cand, def_insn);
+
+  if (!feasible) return false;
+
+  /* Combine zero_extend/sign_extend/AND and if sign_extend and
+     mode of DEST and SRC are different.  */
+
+  bool is_zext =  rtx_is_zext_p (cand->insn)
+		  || cand->code == ZERO_EXTEND
+		  || cand->code == SIGN_EXTEND;
+
+  bool do_elimination = !copy_needed
+			&& is_zext
+			&& (cand->code == SIGN_EXTEND
+			    || GET_MODE (SET_DEST (*sub_rtx)) != ext_src_mode);
+
+  if (((do_elimination
+	&& state->modified[INSN_UID (def_insn)].kind == EXT_MODIFIED_NONE))
+	|| ((state->modified[INSN_UID (def_insn)].kind
+		== (cand->code == ZERO_EXTEND
 		   ? EXT_MODIFIED_ZEXT : EXT_MODIFIED_SEXT))
-	      && state->modified[INSN_UID (def_insn)].mode
-		 == ext_src_mode))
+	     && state->modified[INSN_UID (def_insn)].mode
+		== ext_src_mode))
     {
       if (GET_MODE_UNIT_SIZE (GET_MODE (SET_DEST (*sub_rtx)))
 	  >= GET_MODE_UNIT_SIZE (cand->mode))
@@ -734,7 +1053,6 @@  merge_def_and_ext (ext_cand *cand, rtx_insn *def_insn, ext_state *state)
 	  return true;
 	}
     }
-
   return false;
 }
 
@@ -744,7 +1062,9 @@  merge_def_and_ext (ext_cand *cand, rtx_insn *def_insn, ext_state *state)
 static inline rtx
 get_extended_src_reg (rtx src)
 {
-  while (GET_CODE (src) == SIGN_EXTEND || GET_CODE (src) == ZERO_EXTEND)
+  while (GET_CODE (src) == SIGN_EXTEND
+	 || GET_CODE (src) == ZERO_EXTEND
+	 || rtx_is_zext_p (src))
     src = XEXP (src, 0);
   gcc_assert (REG_P (src));
   return src;
@@ -882,8 +1202,7 @@  combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
 
       /* The destination register of the extension insn must not be
 	 used or set between the def_insn and cand->insn exclusive.  */
-      if (reg_used_between_p (SET_DEST (set), def_insn, cand->insn)
-	  || reg_set_between_p (SET_DEST (set), def_insn, cand->insn))
+      if (reg_used_set_between_p (SET_DEST (set), def_insn, cand->insn))
 	return false;
 
       /* We must be able to copy between the two registers.   Generate,
@@ -975,10 +1294,8 @@  combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
 	     used or set between the def_insn2 and def_insn exclusive.
 	     Likewise for the other reg, i.e. check both reg1 and reg2
 	     in the above comment.  */
-	  if (reg_used_between_p (SET_DEST (set), def_insn2, def_insn)
-	      || reg_set_between_p (SET_DEST (set), def_insn2, def_insn)
-	      || reg_used_between_p (src_reg, def_insn2, def_insn)
-	      || reg_set_between_p (src_reg, def_insn2, def_insn))
+	  if (reg_used_set_between_p (SET_DEST (set), def_insn2, def_insn)
+	      || reg_used_set_between_p (src_reg, def_insn2, def_insn))
 	    break;
 
 	  state->defs_list[0] = def_insn2;
@@ -1004,15 +1321,17 @@  combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
       cand->mode = mode;
     }
 
-  merge_successful = true;
-
+  merge_successful = false;
   /* Go through the defs vector and try to merge all the definitions
      in this vector.  */
   state->modified_list.truncate (0);
   FOR_EACH_VEC_ELT (state->defs_list, defs_ix, def_insn)
     {
       if (merge_def_and_ext (cand, def_insn, state))
-	state->modified_list.safe_push (def_insn);
+	{
+	  merge_successful = true;
+	  state->modified_list.safe_push (def_insn);
+	}
       else
         {
           merge_successful = false;
@@ -1045,34 +1364,71 @@  combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state)
 	 definitions could be merged.  */
       if (apply_change_group ())
         {
-          if (dump_file)
-            fprintf (dump_file, "All merges were successful.\n");
+	  if (state->modified_list.length() == 0)
+	     return false;
+
+	  rtx_insn *insn = state->modified_list[0];
+
+	  if ((cand->code == ZERO_EXTEND || cand->code == SIGN_EXTEND)
+	       && GET_CODE (PATTERN (insn)) == SET
+	       && GET_CODE (SET_SRC (PATTERN (insn))) != XOR
+	       && GET_CODE (SET_SRC (PATTERN (insn))) != PLUS
+	       && GET_CODE (SET_SRC (PATTERN (insn))) != IOR)
+	     return false;
+
+	   if (dump_file)
+	     fprintf (dump_file, "All merges were successful.\n");
 
 	  FOR_EACH_VEC_ELT (state->modified_list, i, def_insn)
 	    {
 	      ext_modified *modified = &state->modified[INSN_UID (def_insn)];
 	      if (modified->kind == EXT_MODIFIED_NONE)
 		modified->kind = (cand->code == ZERO_EXTEND ? EXT_MODIFIED_ZEXT
-						            : EXT_MODIFIED_SEXT);
+							    : EXT_MODIFIED_SEXT);
 
 	      if (copy_needed)
 		modified->do_not_reextend = 1;
 	    }
           return true;
         }
-      else
-        {
-          /* Changes need not be cancelled explicitly as apply_change_group
-             does it.  Print list of definitions in the dump_file for debug
-             purposes.  This extension cannot be deleted.  */
-          if (dump_file)
-            {
-	      fprintf (dump_file,
-		       "Merge cancelled, non-mergeable definitions:\n");
-	      FOR_EACH_VEC_ELT (state->modified_list, i, def_insn)
-	        print_rtl_single (dump_file, def_insn);
-            }
-        }
+	else
+	  {
+	    if (state->modified_list.length() == 0)
+	      return false;
+
+	     rtx_insn *insn = state->modified_list[0];
+
+	     if ((cand->code == ZERO_EXTEND || cand->code == SIGN_EXTEND)
+		  && GET_CODE (PATTERN (insn)) == SET
+		  && GET_CODE (SET_SRC (PATTERN (insn))) != XOR
+		  && GET_CODE (SET_SRC (PATTERN (insn))) != PLUS
+		  && GET_CODE (SET_SRC (PATTERN (insn))) != IOR)
+		return false;
+
+	    if (cand->code == ZERO_EXTEND || cand->code == SIGN_EXTEND)
+	      {
+		FOR_EACH_VEC_ELT (state->modified_list, i, def_insn)
+		  {
+		    ext_modified *modified = &state->modified[INSN_UID (def_insn)];
+		    if (modified->kind == EXT_MODIFIED_NONE)
+		      modified->kind = (cand->code == ZERO_EXTEND ? EXT_MODIFIED_ZEXT
+								  : EXT_MODIFIED_SEXT);
+
+		     modified->do_not_reextend = 1;
+		   }
+		  return true;
+	      }
+	    /* Changes need not be cancelled explicitly as apply_change_group
+		does it.  Print list of definitions in the dump_file for debug
+		purposes.  This extension cannot be deleted.  */
+	    if (dump_file)
+	      {
+		fprintf (dump_file,
+			"Merge cancelled, non-mergeable definitions:\n");
+		FOR_EACH_VEC_ELT (state->modified_list, i, def_insn)
+		  print_rtl_single (dump_file, def_insn);
+	      }
+	   }
     }
   else
     {
@@ -1106,7 +1462,7 @@  add_removable_extension (const_rtx expr, rtx_insn *insn,
   mode = GET_MODE (dest);
 
   if (REG_P (dest)
-      && (code == SIGN_EXTEND || code == ZERO_EXTEND)
+      && (code == SIGN_EXTEND || code == ZERO_EXTEND || rtx_is_zext_p (src))
       && REG_P (XEXP (src, 0)))
     {
       rtx reg = XEXP (src, 0);
@@ -1125,7 +1481,7 @@  add_removable_extension (const_rtx expr, rtx_insn *insn,
 	      fprintf (dump_file, "Cannot eliminate extension:\n");
 	      print_rtl_single (dump_file, insn);
 	      fprintf (dump_file, " because it can operate on uninitialized"
-			          " data\n");
+				  " data\n");
 	    }
 	  return;
 	}
@@ -1321,7 +1677,7 @@  find_and_remove_re (void)
 	      && (REGNO (SET_DEST (set)) != REGNO (XEXP (SET_SRC (set), 0))))
 	    {
               reinsn_copy_list.safe_push (curr_cand->insn);
-              reinsn_copy_list.safe_push (state.defs_list[0]);
+	      reinsn_copy_list.safe_push (state.defs_list[0]);
 	    }
 	  reinsn_del_list.safe_push (curr_cand->insn);
 	  state.modified[INSN_UID (curr_cand->insn)].deleted = 1;
@@ -1368,9 +1724,10 @@  find_and_remove_re (void)
   reinsn_list.release ();
   XDELETEVEC (state.modified);
 
-  if (dump_file && num_re_opportunities > 0)
+  if (dump_file && num_re_opportunities > 0) {
     fprintf (dump_file, "Elimination opportunities = %d realized = %d\n",
 	     num_re_opportunities, num_realized);
+  }
 }
 
 /* Find and remove redundant extensions.  */
diff --git a/gcc/testsuite/g++.target/powerpc/sext-elim.C b/gcc/testsuite/g++.target/powerpc/sext-elim.C
new file mode 100644
index 00000000000..431696cf11e
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/sext-elim.C
@@ -0,0 +1,18 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -free" } */
+
+unsigned long c2l(unsigned char* p)
+{
+  unsigned long res = *p + *(p+1);
+  return res;
+}
+
+long c2sl(signed char* p)
+{
+  long res = *p + *(p+1);
+  return res;
+}
+
+/* { dg-final { scan-assembler-not "extsw" } } */
diff --git a/gcc/testsuite/g++.target/powerpc/zext-elim-1.C b/gcc/testsuite/g++.target/powerpc/zext-elim-1.C
new file mode 100644
index 00000000000..bc6cc0eb3ca
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/zext-elim-1.C
@@ -0,0 +1,19 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -free" } */
+
+extern unsigned char magic1[256];
+
+unsigned int hash(const unsigned char inp[4])
+{
+   const unsigned long long INIT = 0x1ULL;
+   unsigned long long h1 = INIT;
+   h1 = magic1[((unsigned long long)inp[0]) ^ h1];
+   h1 = magic1[((unsigned long long)inp[1]) ^ h1];
+   h1 = magic1[((unsigned long long)inp[2]) ^ h1];
+   h1 = magic1[((unsigned long long)inp[3]) ^ h1];
+   return h1;
+}
+
+/* { dg-final { scan-assembler-not "rlwinm" } } */
diff --git a/gcc/testsuite/g++.target/powerpc/zext-elim-2.C b/gcc/testsuite/g++.target/powerpc/zext-elim-2.C
new file mode 100644
index 00000000000..4e72925104f
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/zext-elim-2.C
@@ -0,0 +1,11 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -free" } */
+
+unsigned char g(unsigned char t[], unsigned char v)
+{
+  return (t[v & 0x7f] & 0x7f) | (v & 0x80);
+}
+
+/* { dg-final { scan-assembler-times "rlwinm" 2 } } */
diff --git a/gcc/testsuite/g++.target/powerpc/zext-elim.C b/gcc/testsuite/g++.target/powerpc/zext-elim.C
new file mode 100644
index 00000000000..56eabbe0c19
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/zext-elim.C
@@ -0,0 +1,30 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -free" } */
+
+#include <stddef.h>
+
+bool foo (int a, int b)
+{
+  if (a > 2)
+    return false;
+
+  if (b < 10)
+    return true;
+
+  return true;
+}
+
+int bar (int a, int b)
+{
+  if (a > 2)
+    return 0;
+
+  if (b < 10)
+    return 1;
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "rldicl" } } */