Message ID | 14904386.2vhstX5eeo@polaris |
---|---|
State | New |
Headers | show |
Series | Fix wrong code with small structure return on PowerPC | expand |
> this is a bug originally reported in Ada on 32-bit PowerPC with the SVR4 ABI > (hence not Linux) and reproducible in C with the attached testcase at -O1. With the right C testcase this time...
On September 11, 2017 12:26:09 PM GMT+02:00, Eric Botcazou <ebotcazou@adacore.com> wrote: >Hi, > >this is a bug originally reported in Ada on 32-bit PowerPC with the >SVR4 ABI >(hence not Linux) and reproducible in C with the attached testcase at >-O1. > >The problem lies in function 'foo': > > struct S ret; > char r, s, c1, c2; > char *p = &r; > > s = bar (&p); > if (s) > c2 = *p; > c1 = 0; > > ret.c1 = c1; > ret.c2 = c2; > return ret; > >Since the call to bar returns 0, c2 is uninitialized at run time (but >this >doesn't matter for correctness since its value is never read). Now >both c1 >and c2 are represented at the RTL level by unsigned promoted subregs, >i.e. >SUBREGs verifying SUBREG_PROMOTED_VAR_P and SUBREG_PROMOTED_UNSIGNED_P > >As a consequence, when store_fixed_bit_field_1 stores the 8-bit values >into >the 'ret' object, it considers that the underlying 32-bit objects have >only 8 >bits set and the rest cleared so it doesn't mask the other 24 bits. >That's >OK for c1 but not for c2, since c2 is uninitialized so the bits are >random. > >This appears to be an inherent weakness of the promoted subregs >mechanism, but >I don't think that we want to go for an upheaval at this point. So the >patch >addresses the problem in an ad-hoc manner directly in >store_fixed_bit_field_1 >and yields the expected 8-bit insertion: > >@@ -26,7 +26,7 @@ > lwz 9,12(1) > lbz 31,0(9) > .L3: >- slwi 3,31,16 >+ rlwinm 3,31,16,8,15 > lwz 0,36(1) > mtlr 0 > lwz 31,28(1) > >Tested on x86-64/Linux and PowerPC64/Linux, OK for the mainline? I think the issue is that we set SUBREG_PROMOTED_* on something that is possibly not so (aka uninitialized in this case). We may only set it if either the ABI or a previous operation forced it to. Maybe this is also the reason why we need this zero init pass in some cases (though that isn't a full solution either). Richard. > >2017-09-11 Eric Botcazou <ebotcazou@adacore.com> > > * expmed.c (store_fixed_bit_field_1): Force the masking if the value > is an unsigned promoted SUBREG of a sufficiently large object. > > >2017-09-11 Eric Botcazou <ebotcazou@adacore.com> > > * gcc.c-torture/execute/20170911-1.c: New test.
> I think the issue is that we set SUBREG_PROMOTED_* on something that is > possibly not so (aka uninitialized in this case). Yes, that's what I called inherent weakness of the promoted subregs mechanism. > We may only set it if either the ABI or a previous operation forced it to. > Maybe this is also the reason why we need this zero init pass in some cases > (though that isn't a full solution either). Do you think that we should zero-init all the unsigned promoted subregs (and sign-extend-init all the signed promoted subregs)? That sounds like a big hammer to me, but I can give it a try.
On Mon, Sep 11, 2017 at 9:59 PM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> I think the issue is that we set SUBREG_PROMOTED_* on something that is >> possibly not so (aka uninitialized in this case). > > Yes, that's what I called inherent weakness of the promoted subregs mechanism. > >> We may only set it if either the ABI or a previous operation forced it to. >> Maybe this is also the reason why we need this zero init pass in some cases >> (though that isn't a full solution either). > > Do you think that we should zero-init all the unsigned promoted subregs (and > sign-extend-init all the signed promoted subregs)? That sounds like a big > hammer to me, but I can give it a try. My suggestion would be to not set SUBREG_PROMOTED_* on "everything" (which we seem to do). zero-initing looks like the easier way to deal with the situation though. ISTR SUBREG_PROMOTED_* gets set by RTL expansion, the problematic one should be the one in expr.c setting it on all SSA_NAMEs, maybe it is enough to avoid setting it for SSA_NAME_IS_DEFAULT_DEF ones? Richard. > -- > Eric Botcazou
> ISTR SUBREG_PROMOTED_* gets set by RTL expansion, the problematic > one should be the one in expr.c setting it on all SSA_NAMEs, maybe it is > enough to avoid setting it for SSA_NAME_IS_DEFAULT_DEF ones? No, it's not enough in this case, you need to work a little harder and look at the arguments of a PHI node because the SSA_NAME_IS_DEFAULT_DEF name is never materialized in the RTL; so the attached patch works in this case. * expr.c (expand_expr_real_1) <expand_decl_rtl>: For a SSA_NAME, do not set SUBREG_PROMOTED_VAR_P on the sub-register if it may contain uninitialized bits.
On Mon, Sep 25, 2017 at 11:35 AM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> ISTR SUBREG_PROMOTED_* gets set by RTL expansion, the problematic >> one should be the one in expr.c setting it on all SSA_NAMEs, maybe it is >> enough to avoid setting it for SSA_NAME_IS_DEFAULT_DEF ones? > > No, it's not enough in this case, you need to work a little harder and look at > the arguments of a PHI node because the SSA_NAME_IS_DEFAULT_DEF name is never > materialized in the RTL; so the attached patch works in this case. > > > * expr.c (expand_expr_real_1) <expand_decl_rtl>: For a SSA_NAME, do > not set SUBREG_PROMOTED_VAR_P on the sub-register if it may contain > uninitialized bits. Reading the patch I think that it gets conservativeness wrong -- shouldn't it be if (is_definitely_initialized) { SUBREG_PROMOTED_VAR_P (temp) = 1; SUBREG_PROMOTED_SET (temp, unsignedp); } ? Of course it's not easy to compute is_definitely_initialized conservatively in an ad-hoc way at this place. It should be relatively straight-forward to do a conservative computation somewhere in cfgexpand.c by propagating across SSA edges and recording a flag on SSA names though. I assume we can take load destinations as fully initialized (should extend properly) as well as call results (the ABI should extend, eventually we can query the target if it does), likewise for function arguments. On your patch: + /* Try to detect if the register contains uninitialized bits. */ + if (SSA_NAME_IS_DEFAULT_DEF (ssa_name)) + maybe_uninitialized = true; if you use ssa_undefined_value_p (ssa_name[, true]) you'd get function paramters not undefined (which is probably desired?). Likewise partial initialized complex would get uninitialized (if passed , true). Same inside the loop over PHI args though I wonder how pessimizing it would be to simply do if (ssa_undefined_value_p (ssa_name, true) || gimple_code (SSA_NAME_DEF_STMT (ssa_name)) == GIMPLE_PHI) maybe_uninitialized = true; thus make all PHIs possibly uninitialized (your code wouldn't catch a chained PHI with undef arg). As said, a better solution would be to do a definitely-initialized mini-propagation at RTL expansion time. I don't stand in the way of "improving" things with your patch (besides the ssa_undefined_value_p use). Thanks, Richard. > -- > Eric Botcazou
> Reading the patch I think that it gets conservativeness wrong -- shouldn't > it be > > if (is_definitely_initialized) > { > SUBREG_PROMOTED_VAR_P (temp) = 1; > SUBREG_PROMOTED_SET (temp, unsignedp); > } > > ? Of course it's not easy to compute is_definitely_initialized > conservatively in an ad-hoc way at this place. It should be relatively > straight-forward to do a conservative computation somewhere in cfgexpand.c > by propagating across SSA edges and recording a flag on SSA names though. > I assume we can take load destinations as fully initialized (should extend > properly) as well as call results (the ABI should extend, eventually we can > query the target if it does), likewise for function arguments. Yes, that's why the comment read "Try to detect if " and not "Detect if ". > On your patch: > > + /* Try to detect if the register contains uninitialized bits. > */ + if (SSA_NAME_IS_DEFAULT_DEF (ssa_name)) > + maybe_uninitialized = true; > > if you use ssa_undefined_value_p (ssa_name[, true]) you'd get function > paramters not undefined (which is probably desired?). Likewise > partial initialized complex would get uninitialized (if passed , true). Ah, yes, I overlooked that. > Same inside the loop over PHI args though I wonder how pessimizing > it would be to simply do > > if (ssa_undefined_value_p (ssa_name, true) > > || gimple_code (SSA_NAME_DEF_STMT (ssa_name)) == GIMPLE_PHI) > > maybe_uninitialized = true; > > thus make all PHIs possibly uninitialized (your code wouldn't catch a > chained PHI with undef arg). Too big a hammer for such a very rare bug I think. > As said, a better solution would be to do a definitely-initialized > mini-propagation at RTL expansion time. I'm not sure if we really need to propagate. What about the attached patch? It computes at expansion time whether the partition the SSA name is a member of contains an undefined value and, if so, doesn't set the promoted bit for the SUBREG. My gut feeling is that it's sufficient in practice. * tree-outof-ssa.h (ssaexpand): Add partitions_for_undefined_values. (always_initialized_rtx_for_ssa_name_p): New predicate. * tree-outof-ssa.c (remove_ssa_form): Initialize new field of SA. (finish_out_of_ssa): Free new field of SA. * tree-ssa-coalesce.h (get_undefined_value_partitions): Declare. * tree-ssa-coalesce.c: Include tree-ssa.h. (get_parm_default_def_partitions): Remove extern keyword. (get_undefined_value_partitions): New function. * expr.c (expand_expr_real_1) <expand_decl_rtl>: For a SSA_NAME, do not set SUBREG_PROMOTED_VAR_P on the sub-register if it may contain uninitialized bits.
On Tue, Oct 3, 2017 at 8:39 PM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> Reading the patch I think that it gets conservativeness wrong -- shouldn't >> it be >> >> if (is_definitely_initialized) >> { >> SUBREG_PROMOTED_VAR_P (temp) = 1; >> SUBREG_PROMOTED_SET (temp, unsignedp); >> } >> >> ? Of course it's not easy to compute is_definitely_initialized >> conservatively in an ad-hoc way at this place. It should be relatively >> straight-forward to do a conservative computation somewhere in cfgexpand.c >> by propagating across SSA edges and recording a flag on SSA names though. >> I assume we can take load destinations as fully initialized (should extend >> properly) as well as call results (the ABI should extend, eventually we can >> query the target if it does), likewise for function arguments. > > Yes, that's why the comment read "Try to detect if " and not "Detect if ". > >> On your patch: >> >> + /* Try to detect if the register contains uninitialized bits. >> */ + if (SSA_NAME_IS_DEFAULT_DEF (ssa_name)) >> + maybe_uninitialized = true; >> >> if you use ssa_undefined_value_p (ssa_name[, true]) you'd get function >> paramters not undefined (which is probably desired?). Likewise >> partial initialized complex would get uninitialized (if passed , true). > > Ah, yes, I overlooked that. > >> Same inside the loop over PHI args though I wonder how pessimizing >> it would be to simply do >> >> if (ssa_undefined_value_p (ssa_name, true) >> >> || gimple_code (SSA_NAME_DEF_STMT (ssa_name)) == GIMPLE_PHI) >> >> maybe_uninitialized = true; >> >> thus make all PHIs possibly uninitialized (your code wouldn't catch a >> chained PHI with undef arg). > > Too big a hammer for such a very rare bug I think. > >> As said, a better solution would be to do a definitely-initialized >> mini-propagation at RTL expansion time. > > I'm not sure if we really need to propagate. What about the attached patch? > It computes at expansion time whether the partition the SSA name is a member > of contains an undefined value and, if so, doesn't set the promoted bit for > the SUBREG. My gut feeling is that it's sufficient in practice. I'll take your word for it ;) The patch is ok. Thanks, Richard. > > * tree-outof-ssa.h (ssaexpand): Add partitions_for_undefined_values. > (always_initialized_rtx_for_ssa_name_p): New predicate. > * tree-outof-ssa.c (remove_ssa_form): Initialize new field of SA. > (finish_out_of_ssa): Free new field of SA. > * tree-ssa-coalesce.h (get_undefined_value_partitions): Declare. > * tree-ssa-coalesce.c: Include tree-ssa.h. > (get_parm_default_def_partitions): Remove extern keyword. > (get_undefined_value_partitions): New function. > * expr.c (expand_expr_real_1) <expand_decl_rtl>: For a SSA_NAME, do > not set SUBREG_PROMOTED_VAR_P on the sub-register if it may contain > uninitialized bits. > > -- > Eric Botcazou
> I'll take your word for it ;)
Thanks. Testing on PowerPC64 revealed a couple of nits:
1. SSA names with zero uses need to be excluded from the computation, because
the first SSA name in a function returning a GIMPLE type is associated with
the RESULT_DECL and is undefined, so it would propagate the undefinedness to
every SSA name collapsed with the RESULT_DECL, i.e. unduly pessimization.
2. Removing the SUBREG_PROMOTED_VAR_P flag disables the trick present in
expand_gimple_stmt_1 for the LHS of assignment statements, so you end up with
(more) SUBREGs on the LHS of moves in RTL, hence the fixlet for loop-iv.c.
I bootstrapped/regtested it on x86-64, SPARC64, Aarch64 and PowerPC64/Linux,
and compared the code generated for the tests in gcc.c-torture/compile at -O2:
very few changes for the first 3, but around 30 tests changed for PowerPC64,
all with uninitialized variables AFAICS.
Applied on the mainline.
2017-10-08 Eric Botcazou <ebotcazou@adacore.com>
* tree-outof-ssa.h (ssaexpand): Add partitions_for_undefined_values.
(always_initialized_rtx_for_ssa_name_p): New predicate.
* tree-outof-ssa.c (remove_ssa_form): Initialize new field of SA.
(finish_out_of_ssa): Free new field of SA.
* tree-ssa-coalesce.h (get_undefined_value_partitions): Declare.
* tree-ssa-coalesce.c: Include tree-ssa.h.
(get_parm_default_def_partitions): Remove extern keyword.
(get_undefined_value_partitions): New function.
* expr.c (expand_expr_real_1) <expand_decl_rtl>: For a SSA_NAME, do
not set SUBREG_PROMOTED_VAR_P on the sub-register if it may contain
uninitialized bits.
* loop-iv.c (iv_get_reaching_def): Disqualify all subregs.
2017-10-08 Eric Botcazou <ebotcazou@adacore.com>
* gcc.c-torture/execute/20171008-1.c: New test.
Index: expmed.c =================================================================== --- expmed.c (revision 251906) +++ expmed.c (working copy) @@ -1213,13 +1213,25 @@ store_fixed_bit_field_1 (rtx op0, scalar } else { - int must_and = (GET_MODE_BITSIZE (value_mode) != bitsize - && bitnum + bitsize != GET_MODE_BITSIZE (mode)); + /* Compute whether we must mask the value in order to make sure that the + bits outside the bitfield are all cleared. Note that, even though the + value has the right size, if it has a mode different from MODE, then + converting it to MODE may unmask uninitialized bits in the case where + it is an unsigned promoted SUBREG of a sufficiently large object. */ + const bool must_mask + = (GET_MODE_BITSIZE (value_mode) != bitsize + || (value_mode != mode + && GET_CODE (value) == SUBREG + && SUBREG_PROMOTED_VAR_P (value) + && SUBREG_PROMOTED_UNSIGNED_P (value) + && GET_MODE_SIZE (GET_MODE (SUBREG_REG (value))) + >= GET_MODE_SIZE (mode))) + && bitnum + bitsize != GET_MODE_BITSIZE (mode); if (value_mode != mode) value = convert_to_mode (mode, value, 1); - if (must_and) + if (must_mask) value = expand_binop (mode, and_optab, value, mask_rtx (mode, 0, bitsize, 0), NULL_RTX, 1, OPTAB_LIB_WIDEN);