
Regimplification enhancements 3/3

Message ID 20140617145425.GD29831@virgil.suse
State New

Commit Message

Martin Jambor June 17, 2014, 2:54 p.m. UTC
On Mon, Jun 16, 2014 at 01:38:49PM +0200, Richard Biener wrote:
> On Mon, Jun 16, 2014 at 12:57 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> > There's code in regimplification that makes us use an extra temporary
> > when we encounter a call returning a non-BLKmode structure. This seems
> > somewhat inefficient and unnecessary, and when used from the
> > lower-addr-spaces pass I'm working on it leads to problems further
> > down that look like tree-ssa bugs that I wasn't able to clearly
> > disentangle.
> >
> > Here's what happens on compile/pr51761.c.  Regimplification has the
> > following effect, creating an extra temporary _6:
> >
> > -  D.1378 = fooD.1373 (aD.1377);
> > +  _6 = fooD.1373 (aD.1377);
> > +  # .MEMD.1382 = VDEF <.MEMD.1382>
> > +  D.1378 = _6;
> >
> > SRA turns this into:
> >
> >   _6 = fooD.1373 (aD.1377);
> >   # VUSE <.MEM_3>
> >   SR$2_7 = MEM[(struct S *)&_6];
> 
> clearly bogus - _6 is a register, you can't use a MEM on it.

Weird... does the following (untested) patch help?


It is just a quick thought though.  If it does not, could you post the
access trees dumped by -fdump-tree-esra-details or
-fdump-tree-sra-details (depending on whether this is early or late
SRA)?  Or is it simple to set it up locally?

Thanks,

Martin

> 
> > Somehow, the address of &_6 doesn't count as a use, and the DCE pass decides
> > it is unused:
> >
> >   Eliminating unnecessary statements:
> >   Deleting LHS of call: _6 = foo (a);
> >
> > However, the statement
> >   SR$2_7 = MEM[(struct S *)&_6];
> > is still present, and we have an SSA name without a definition, leading to a
> > crash.
> >
> > Rather than figure all this out, I decided to try making the
> > regimplification not generate the extra copy in the first place. The
> > testsuite seems to agree with me that it's unnecessary. Bootstrapped and
> > tested on x86_64-linux, ok?
> 
> Ok.  The code looks bogus anyway in that it generates a SSA name
> for sth not is_gimple_reg_type ().
> 
> Thanks,
> Richard.
> 
> >
> > Bernd
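
For readers without the testsuite at hand, here is a minimal sketch of the kind of source that produces the GIMPLE discussed above. It is a hypothetical reduction, not the actual contents of compile/pr51761.c. The struct layout (a single 32-bit field named len) is inferred from the SRA access dump posted later in this thread, and the body of foo is a stand-in added only so the example links. A struct like this typically has a non-BLKmode machine mode, which is the case the regimplification code treats specially.

```c
/* Hypothetical reduction, NOT the real compile/pr51761.c.
   struct S is assumed to contain one 32-bit field named len,
   matching "expr = D.1379.len, type = unsigned int, size = 32"
   in the SRA dump.  */
struct S
{
  unsigned int len;
};

/* Stand-in callee: the thread never shows foo's body, so this one
   is invented purely to make the example self-contained.  */
struct S
foo (struct S s)
{
  s.len += 1;
  return s;
}

/* Mirrors the shape of the dumped function:
     a = x;  D.1379 = foo (a);  D.1381 = D.1379;  return D.1381;  */
struct S
bar (struct S x)
{
  struct S a = x;
  return foo (a);
}
```

Calling bar with a struct whose len is 41 returns a struct whose len is 42 under this stand-in definition of foo; the interesting part is only the aggregate-returning call inside bar.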

Comments

Bernd Schmidt June 30, 2014, 3:13 p.m. UTC | #1
On 06/17/2014 04:54 PM, Martin Jambor wrote:
> Weird... does the following (untested) patch help?
>
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 0afa197..747b1b6 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -3277,6 +3277,8 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
>
>     if (modify_this_stmt
>         || gimple_has_volatile_ops (*stmt)
> +      || is_gimple_reg (lhs)
> +      || is_gimple_reg (rhs)
>         || contains_vce_or_bfcref_p (rhs)
>         || contains_vce_or_bfcref_p (lhs)
>         || stmt_ends_bb_p (*stmt))

Unfortunately not.

> It is just a quick thought though.  If it does not, could you post the
> access trees dumped by -fdump-tree-esra-details or
> -fdump-tree-sra-details (depending on whether this is early or late
> SRA)?  Or is it simple to set it up locally?

Not really. It needs a whole patch tree for the ptx port. I'm attaching 
the last two dump files.


Bernd
;; Function bar (bar, funcdef_no=0, decl_uid=1376, symbol_order=0)


Pass statistics:
----------------


Pass statistics:
----------------

bar (struct S xD.1375)
{
  struct S D.1385;
  struct S aD.1378;
  struct S D.1379;
  struct S D.1381;

;;   basic block 2, loop depth 0, count 0, freq 10000, maybe hot
;;    prev block 0, next block 1, flags: (NEW, REACHABLE)
;;    pred:       ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
  # .MEM_2 = VDEF <.MEM_1(D)>
  aD.1378 = xD.1375;
  # .MEM_3 = VDEF <.MEM_2>
  # USE = nonlocal 
  # CLB = nonlocal 
  _6 = fooD.1374 (aD.1378);
  # .MEM_7 = VDEF <.MEM_3>
  D.1379 = _6;
  # .MEM_4 = VDEF <.MEM_7>
  aD.1378 ={v} {CLOBBER};
  # .MEM_5 = VDEF <.MEM_4>
  D.1381 = D.1379;
  # VUSE <.MEM_5>
  return D.1381;
;;    succ:       EXIT [100.0%] 

}
;; Function bar (bar, funcdef_no=0, decl_uid=1376, symbol_order=0)


Pass statistics:
----------------

Candidate (1375): x
Candidate (1385): D.1385
Candidate (1378): a
Candidate (1379): D.1379
Candidate (1381): D.1381
Will attempt to totally scalarize D.1379 (UID: 1379): 
! Disqualifying D.1385 - No or inhibitingly overlapping accesses.
! Disqualifying x - No scalar replacements to be created.
! Disqualifying a - No scalar replacements to be created.
Created a replacement for D.1379 offset: 0, size: 32: SR$2

Access trees for D.1379 (UID: 1379): 
access { base = (1379)'D.1379', offset = 0, size = 32, expr = D.1379.len, type = unsigned int, grp_read = 1, grp_write = 1, grp_assignment_read = 1, grp_assignment_write = 1, grp_scalar_read = 1, grp_scalar_write = 0, grp_total_scalarization = 1, grp_hint = 1, grp_covered = 1, grp_unscalarizable_region = 0, grp_unscalarized_data = 0, grp_partial_lhs = 0, grp_to_be_replaced = 1, grp_to_be_debug_replaced = 0, grp_maybe_modified = 0, grp_not_necessarilly_dereferenced = 0

! Disqualifying D.1381 - No scalar replacements to be created.

Pass statistics:
----------------
Scalarized aggregates: 1
Modified expressions: 2
Separate LHS and RHS handling: 2
Scalar replacements created: 1


Updating SSA:
Registering new PHI nodes in block #0
Registering new PHI nodes in block #2
Updating SSA information for statement SR$2 = MEM[(struct S *)&_6];
Updating SSA information for statement MEM[(struct S *)&D.1381] = SR$2;

DFA Statistics for bar

---------------------------------------------------------
                                Number of        Memory
                                instances         used 
---------------------------------------------------------
USE operands                              1          8b
DEF operands                              2         16b
VUSE operands                             6         48b
VDEF operands                             4         32b
PHI nodes                                 0          0b
PHI arguments                             0          0b
---------------------------------------------------------
Total memory used by DFA/SSA data                  104b
---------------------------------------------------------



Hash table statistics:
    var_infos:   size 61, 1 elements, 0.000000 collision/search ratio


Symbols to be put in SSA form
{ D.1387 }
Incremental SSA update started at block: 0
Number of blocks in CFG: 3
Number of blocks to update: 2 ( 67%)
Affected blocks: 0 2
Martin Jambor July 24, 2014, 12:38 p.m. UTC | #2
Hi,

sorry for late reply, I've been on vacation and then preparing for
Cauldron.  Anyway...

On Mon, Jun 30, 2014 at 05:13:13PM +0200, Bernd Schmidt wrote:
> On 06/17/2014 04:54 PM, Martin Jambor wrote:
> >Weird... does the following (untested) patch help?
> >
> >diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> >index 0afa197..747b1b6 100644
> >--- a/gcc/tree-sra.c
> >+++ b/gcc/tree-sra.c
> >@@ -3277,6 +3277,8 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
> >
> >    if (modify_this_stmt
> >        || gimple_has_volatile_ops (*stmt)
> >+      || is_gimple_reg (lhs)
> >+      || is_gimple_reg (rhs)
> >        || contains_vce_or_bfcref_p (rhs)
> >        || contains_vce_or_bfcref_p (lhs)
> >        || stmt_ends_bb_p (*stmt))
> 
> Unfortunately not.
> 
> >It is just a quick thought though.  If it does not, could you post the
> >access trees dumped by -fdump-tree-esra-details or
> >-fdump-tree-sra-details (depending on whether this is early or late
> >SRA)?  Or is it simple to set it up locally?
> 
> Not really. It needs a whole patch tree for the ptx port. I'm
> attaching the last two dump files.
> 
> 
> Bernd
> 

> 
> ;; Function bar (bar, funcdef_no=0, decl_uid=1376, symbol_order=0)
> 
> 
> Pass statistics:
> ----------------
> 
> 
> Pass statistics:
> ----------------
> 
> bar (struct S xD.1375)
> {
>   struct S D.1385;
>   struct S aD.1378;
>   struct S D.1379;
>   struct S D.1381;
> 
> ;;   basic block 2, loop depth 0, count 0, freq 10000, maybe hot
> ;;    prev block 0, next block 1, flags: (NEW, REACHABLE)
> ;;    pred:       ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   aD.1378 = xD.1375;
>   # .MEM_3 = VDEF <.MEM_2>
>   # USE = nonlocal 
>   # CLB = nonlocal 
>   _6 = fooD.1374 (aD.1378);
>   # .MEM_7 = VDEF <.MEM_3>
>   D.1379 = _6;

This seems to be the statement which has its RHS converted to a
MEM_REF[&_6], am I right?  I wonder whether it is correct input
though, because it looks like it has mismatched types.  The LHS is
clearly an aggregate of type struct S while the RHS is an SSA name,
meaning it cannot be of an aggregate type.  Does this pass gimple
checking?  What creates that statement?

Thanks,

Martin


Richard Biener July 24, 2014, 12:53 p.m. UTC | #3
On Thu, Jul 24, 2014 at 2:38 PM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> sorry for late reply, I've been on vacation and then preparing for
> Cauldron.  Anyway...
>
> On Mon, Jun 30, 2014 at 05:13:13PM +0200, Bernd Schmidt wrote:
>> On 06/17/2014 04:54 PM, Martin Jambor wrote:
>> >Weird... does the following (untested) patch help?
>> >
>> >diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>> >index 0afa197..747b1b6 100644
>> >--- a/gcc/tree-sra.c
>> >+++ b/gcc/tree-sra.c
>> >@@ -3277,6 +3277,8 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
>> >
>> >    if (modify_this_stmt
>> >        || gimple_has_volatile_ops (*stmt)
>> >+      || is_gimple_reg (lhs)
>> >+      || is_gimple_reg (rhs)
>> >        || contains_vce_or_bfcref_p (rhs)
>> >        || contains_vce_or_bfcref_p (lhs)
>> >        || stmt_ends_bb_p (*stmt))
>>
>> Unfortunately not.
>>
>> >It is just a quick thought though.  If it does not, could you post the
>> >access trees dumped by -fdump-tree-esra-details or
>> >-fdump-tree-sra-details (depending on whether this is early or late
>> >SRA)?  Or is it simple to set it up locally?
>>
>> Not really. It needs a whole patch tree for the ptx port. I'm
>> attaching the last two dump files.
>>
>>
>> Bernd
>>
>
>>
>> ;; Function bar (bar, funcdef_no=0, decl_uid=1376, symbol_order=0)
>>
>>
>> Pass statistics:
>> ----------------
>>
>>
>> Pass statistics:
>> ----------------
>>
>> bar (struct S xD.1375)
>> {
>>   struct S D.1385;
>>   struct S aD.1378;
>>   struct S D.1379;
>>   struct S D.1381;
>>
>> ;;   basic block 2, loop depth 0, count 0, freq 10000, maybe hot
>> ;;    prev block 0, next block 1, flags: (NEW, REACHABLE)
>> ;;    pred:       ENTRY [100.0%]  (FALLTHRU,EXECUTABLE)
>>   # .MEM_2 = VDEF <.MEM_1(D)>
>>   aD.1378 = xD.1375;
>>   # .MEM_3 = VDEF <.MEM_2>
>>   # USE = nonlocal
>>   # CLB = nonlocal
>>   _6 = fooD.1374 (aD.1378);
>>   # .MEM_7 = VDEF <.MEM_3>
>>   D.1379 = _6;
>
> This seems to be the statement which has its RHS converted to a
> MEM_REF[&_6], am I right?  I wonder whether it is correct input
> though, because it looks like it has mismatched types.  The LHS is
> clearly an aggregate of type struct S while the RHS is an SSA name,
> meaning it cannot be of an aggregate type.  Does this pass gimple
> checking?  What creates that statement?

Yeah, looks clearly invalid.  MEM_REF[&_6] is not valid even if
the types were correct (taking the address of an SSA name).

Richard.
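
The two rules invoked here (an SSA name may only have a register type, and the address of an SSA name may not be taken) can be illustrated with a toy model. This is only a sketch under stated assumptions: every name prefixed with toy_ is hypothetical, none of it is GCC code, and the real predicates such as is_gimple_reg_type () operate on GCC trees rather than on a three-value enum.

```c
#include <stdbool.h>

/* Toy type classification; GCC's real type system is far richer.  */
enum toy_type_kind
{
  TOY_INTEGER,
  TOY_POINTER,
  TOY_STRUCT
};

/* Toy operand: either an SSA name or a declared variable in memory.  */
struct toy_operand
{
  enum toy_type_kind kind;
  bool is_ssa_name;
};

/* Models the idea behind is_gimple_reg_type (): aggregates are not
   register types, everything else in this toy model is.  */
static bool
toy_is_reg_type (enum toy_type_kind kind)
{
  return kind != TOY_STRUCT;
}

/* An operand is well formed if it is a memory variable, or an SSA
   name whose type is a register type.  */
static bool
toy_operand_valid_p (struct toy_operand op)
{
  return !op.is_ssa_name || toy_is_reg_type (op.kind);
}

/* Only memory operands have an address; an SSA name does not.  */
static bool
toy_addressable_p (struct toy_operand op)
{
  return !op.is_ssa_name;
}
```

In this model, _6 from the dump corresponds to an operand with kind TOY_STRUCT and is_ssa_name set, which toy_operand_valid_p rejects; and MEM[(struct S *)&_6] fails independently because toy_addressable_p is false for any SSA name, which is the second point made above.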

Bernd Schmidt July 24, 2014, 1:19 p.m. UTC | #4
On 07/24/2014 02:38 PM, Martin Jambor wrote:
> This seems to be the statement which has its RHS converted to a
> MEM_REF[&_6], am I right?  I wonder whether it is correct input
> though, because it looks like it has mismatched types.  The LHS is
> clearly an aggregate of type struct S while the RHS is an SSA name,
> meaning it cannot be of an aggregate type.  Does this pass gimple
> checking?  What creates that statement?

The code in gimplify-me which I was proposing to remove. I guess I'll 
just commit that patch.


Bernd

Patch

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 0afa197..747b1b6 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -3277,6 +3277,8 @@  sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi)
 
   if (modify_this_stmt
       || gimple_has_volatile_ops (*stmt)
+      || is_gimple_reg (lhs)
+      || is_gimple_reg (rhs)
       || contains_vce_or_bfcref_p (rhs)
       || contains_vce_or_bfcref_p (lhs)
       || stmt_ends_bb_p (*stmt))