diff mbox

[C++] 79118 bitfields & constexpr

Message ID 9dc01f9e-6d84-0e30-0dac-56b52892042f@acm.org
State New
Headers show

Commit Message

Nathan Sidwell Jan. 23, 2017, 1:49 p.m. UTC
This patch addresses 79118, where we ICE on a constexpr involving 
bitfields in an unnamed struct (unnamed struct members are a g++ extension).

This is really a band-aid, because our internal representation 
BITFIELD_REF and the (premature) optimizations it encourages is 
incompatible with constexpr requirements.  There's already signs of that 
with Jakub's code to deal with fold turning things like:
   int foo: 5;
   int baz: 3;
...
   ptr->baz == cst
turns into the moral equivalent of (little endian example)
   ((*(unsigned char *)ptr & 0xe0) == (cst << 5)

In this particular case we've also made the base object the containing 
class, not the unnamed struct member.  That means we're looking in the 
wrong CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true.  Whereas 
for the subobj's CONSTRUCTOR it is false.  With this patch we end up 
thinking this is an OK constexpr of value zero, whereas it's actually an 
uninitialized bitfield read.

But I think we could already make that mistake given the above example & 
fold's behaviour.  If 'ptr->foo' has been initialized, we'll construct a 
value using just that field and think we have a valid initialization of 
ptr->baz too.

The equivalent testcase using non-bitfields has a base object of the 
unnamed struct member, and behaves correctly (given this is an extension 
anyway).

The right solution is to fix the IR.  In the C++ FE have BITFIELD_REF 
(or a new node) look much more like COMPONENT_REF (or even be 
COMPONENT_REF, but I suspect lots of places think ADDR (COMPONENT_REF 
(...)) is legit).  And lower it to the middle-end generic representation 
in cp_genericize.  I don't think this is the right time for a change of 
that magnitude though.

Perhaps for now we should just always err on the side of safety, and 
behave as-if uninitialized if we fall of the end of looking for a bitfield?

thoughts?

nathan

Comments

Jason Merrill Jan. 23, 2017, 9:06 p.m. UTC | #1
On Mon, Jan 23, 2017 at 8:49 AM, Nathan Sidwell <nathan@acm.org> wrote:
> This patch addresses 79118, where we ICE on a constexpr involving bitfields
> in an unnamed struct (unnamed struct members are a g++ extension).
>
> This is really a band-aid, because our internal representation BITFIELD_REF
> and the (premature) optimizations it encourages is incompatible with
> constexpr requirements.  There's already signs of that with Jakub's code to
> deal with fold turning things like:
>   int foo: 5;
>   int baz: 3;
> ...
>   ptr->baz == cst
> turns into the moral equivalent of (little endian example)
>   ((*(unsigned char *)ptr & 0xe0) == (cst << 5)
>
> In this particular case we've also made the base object the containing
> class, not the unnamed struct member.  That means we're looking in the wrong
> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true.  Whereas for the
> subobj's CONSTRUCTOR it is false.

Why is it false?

> With this patch we end up thinking this
> is an OK constexpr of value zero, whereas it's actually an uninitialized
> bitfield read.
>
> But I think we could already make that mistake given the above example &
> fold's behaviour.  If 'ptr->foo' has been initialized, we'll construct a
> value using just that field and think we have a valid initialization of
> ptr->baz too.
>
> The equivalent testcase using non-bitfields has a base object of the unnamed
> struct member, and behaves correctly (given this is an extension anyway).
>
> The right solution is to fix the IR.  In the C++ FE have BITFIELD_REF (or a
> new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I
> suspect lots of places think ADDR (COMPONENT_REF (...)) is legit).  And
> lower it to the middle-end generic representation in cp_genericize.  I don't
> think this is the right time for a change of that magnitude though.

Yes, for this and other reasons we should do constexpr evaluation on
the pre-genericized trees.

> Perhaps for now we should just always err on the side of safety, and behave
> as-if uninitialized if we fall of the end of looking for a bitfield?

That makes sense.

Jason
Jakub Jelinek Jan. 23, 2017, 9:13 p.m. UTC | #2
On Mon, Jan 23, 2017 at 04:06:22PM -0500, Jason Merrill wrote:
> > The right solution is to fix the IR.  In the C++ FE have BITFIELD_REF (or a
> > new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I
> > suspect lots of places think ADDR (COMPONENT_REF (...)) is legit).  And
> > lower it to the middle-end generic representation in cp_genericize.  I don't
> > think this is the right time for a change of that magnitude though.
> 
> Yes, for this and other reasons we should do constexpr evaluation on
> the pre-genericized trees.

Maybe even pre-cp_folded.

	Jakub
Jason Merrill Jan. 23, 2017, 9:35 p.m. UTC | #3
On Mon, Jan 23, 2017 at 4:13 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Jan 23, 2017 at 04:06:22PM -0500, Jason Merrill wrote:
>> > The right solution is to fix the IR.  In the C++ FE have BITFIELD_REF (or a
>> > new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I
>> > suspect lots of places think ADDR (COMPONENT_REF (...)) is legit).  And
>> > lower it to the middle-end generic representation in cp_genericize.  I don't
>> > think this is the right time for a change of that magnitude though.
>>
>> Yes, for this and other reasons we should do constexpr evaluation on
>> the pre-genericized trees.
>
> Maybe even pre-cp_folded.

Definitely.  I think of that as being part of genericization.

Jason
Jakub Jelinek Jan. 24, 2017, 10:49 a.m. UTC | #4
On Mon, Jan 23, 2017 at 08:49:43AM -0500, Nathan Sidwell wrote:
> This patch addresses 79118, where we ICE on a constexpr involving bitfields
> in an unnamed struct (unnamed struct members are a g++ extension).
> 
> This is really a band-aid, because our internal representation BITFIELD_REF
> and the (premature) optimizations it encourages is incompatible with
> constexpr requirements.  There's already signs of that with Jakub's code to
> deal with fold turning things like:
>   int foo: 5;
>   int baz: 3;
> ...
>   ptr->baz == cst
> turns into the moral equivalent of (little endian example)
>   ((*(unsigned char *)ptr & 0xe0) == (cst << 5)
> 
> In this particular case we've also made the base object the containing
> class, not the unnamed struct member.  That means we're looking in the wrong
> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true.  Whereas for the
> subobj's CONSTRUCTOR it is false.  With this patch we end up thinking this
> is an OK constexpr of value zero, whereas it's actually an uninitialized
> bitfield read.
> 
> But I think we could already make that mistake given the above example &
> fold's behaviour.  If 'ptr->foo' has been initialized, we'll construct a
> value using just that field and think we have a valid initialization of
> ptr->baz too.
> 
> The equivalent testcase using non-bitfields has a base object of the unnamed
> struct member, and behaves correctly (given this is an extension anyway).
> 
> The right solution is to fix the IR.  In the C++ FE have BITFIELD_REF (or a
> new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I
> suspect lots of places think ADDR (COMPONENT_REF (...)) is legit).  And
> lower it to the middle-end generic representation in cp_genericize.  I don't
> think this is the right time for a change of that magnitude though.
> 
> Perhaps for now we should just always err on the side of safety, and behave
> as-if uninitialized if we fall of the end of looking for a bitfield?

Please note that say for:
struct S { int : 1; int a : 3; int : 1; int b : 2; };

int
foo (S a, S b)
{
  return a.a == b.a && a.b == b.b;
}

(haven't tried to turn that into a constexpr testcase, but shouldn't be
hard), the folding creates
((BIT_FIELD_REF <a, 8, 0> ^ BIT_FIELD_REF <b, 8, 0>) & 110) == 0
out of that.  So unless we DTRT (i.e. save constexpr bodies before
cp_fold for constexpr evaluation purposes), the workaround would need
to handle this properly (basically pattern recognize whatever the
folding may create for the bitfield tests and undo it or track bitwise which
bits are known to be constexpr and which bits are undefined and during
the BIT_AND_EXPR testing verify it isn't asking for any bits that aren't
constexpr).  And that there can be multiple bitfields covered by the same
BIT_FIELD_REF.

	Jakub
Richard Biener Jan. 24, 2017, 10:54 a.m. UTC | #5
On Tue, Jan 24, 2017 at 11:49 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Jan 23, 2017 at 08:49:43AM -0500, Nathan Sidwell wrote:
>> This patch addresses 79118, where we ICE on a constexpr involving bitfields
>> in an unnamed struct (unnamed struct members are a g++ extension).
>>
>> This is really a band-aid, because our internal representation BITFIELD_REF
>> and the (premature) optimizations it encourages is incompatible with
>> constexpr requirements.  There's already signs of that with Jakub's code to
>> deal with fold turning things like:
>>   int foo: 5;
>>   int baz: 3;
>> ...
>>   ptr->baz == cst
>> turns into the moral equivalent of (little endian example)
>>   ((*(unsigned char *)ptr & 0xe0) == (cst << 5)
>>
>> In this particular case we've also made the base object the containing
>> class, not the unnamed struct member.  That means we're looking in the wrong
>> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true.  Whereas for the
>> subobj's CONSTRUCTOR it is false.  With this patch we end up thinking this
>> is an OK constexpr of value zero, whereas it's actually an uninitialized
>> bitfield read.
>>
>> But I think we could already make that mistake given the above example &
>> fold's behaviour.  If 'ptr->foo' has been initialized, we'll construct a
>> value using just that field and think we have a valid initialization of
>> ptr->baz too.
>>
>> The equivalent testcase using non-bitfields has a base object of the unnamed
>> struct member, and behaves correctly (given this is an extension anyway).
>>
>> The right solution is to fix the IR.  In the C++ FE have BITFIELD_REF (or a
>> new node) look much more like COMPONENT_REF (or even be COMPONENT_REF, but I
>> suspect lots of places think ADDR (COMPONENT_REF (...)) is legit).  And
>> lower it to the middle-end generic representation in cp_genericize.  I don't
>> think this is the right time for a change of that magnitude though.
>>
>> Perhaps for now we should just always err on the side of safety, and behave
>> as-if uninitialized if we fall of the end of looking for a bitfield?
>
> Please note that say for:
> struct S { int : 1; int a : 3; int : 1; int b : 2; };
>
> int
> foo (S a, S b)
> {
>   return a.a == b.a && a.b == b.b;
> }
>
> (haven't tried to turn that into a constexpr testcase, but shouldn't be
> hard), the folding creates
> ((BIT_FIELD_REF <a, 8, 0> ^ BIT_FIELD_REF <b, 8, 0>) & 110) == 0
> out of that.  So unless we DTRT (i.e. save constexpr bodies before
> cp_fold for constexpr evaluation purposes), the workaround would need
> to handle this properly (basically pattern recognize whatever the
> folding may create for the bitfield tests and undo it or track bitwise which
> bits are known to be constexpr and which bits are undefined and during
> the BIT_AND_EXPR testing verify it isn't asking for any bits that aren't
> constexpr).  And that there can be multiple bitfields covered by the same
> BIT_FIELD_REF.

Just chiming in to say (again) this folding is bad and/or premature.  A way
to avoid it for C++ would be to move it to gimplification?  It won't
ever trigger
on GIMPLE.  Not that I like gimplification specific "simplifications" too much.

Richard.

>
>         Jakub
Jakub Jelinek Jan. 24, 2017, 10:59 a.m. UTC | #6
On Tue, Jan 24, 2017 at 11:54:09AM +0100, Richard Biener wrote:
> > Please note that say for:
> > struct S { int : 1; int a : 3; int : 1; int b : 2; };
> >
> > int
> > foo (S a, S b)
> > {
> >   return a.a == b.a && a.b == b.b;
> > }
> >
> > (haven't tried to turn that into a constexpr testcase, but shouldn't be
> > hard), the folding creates
> > ((BIT_FIELD_REF <a, 8, 0> ^ BIT_FIELD_REF <b, 8, 0>) & 110) == 0
> > out of that.  So unless we DTRT (i.e. save constexpr bodies before
> > cp_fold for constexpr evaluation purposes), the workaround would need
> > to handle this properly (basically pattern recognize whatever the
> > folding may create for the bitfield tests and undo it or track bitwise which
> > bits are known to be constexpr and which bits are undefined and during
> > the BIT_AND_EXPR testing verify it isn't asking for any bits that aren't
> > constexpr).  And that there can be multiple bitfields covered by the same
> > BIT_FIELD_REF.
> 
> Just chiming in to say (again) this folding is bad and/or premature.  A way
> to avoid it for C++ would be to move it to gimplification?  It won't
> ever trigger
> on GIMPLE.  Not that I like gimplification specific "simplifications" too much.

Sure, it won't hurt to move this optimizations later if it will not be much
harder, on the other side, there are many other reasons why constexpr
evaluation should be done on unfolded bodies (e.g., if there is an out of
bound array access or some other undefined behavior somewhere where it is
folded away (say (i = 5; j = arr[i] & 0;) on int arr[3];), C++ should still error out
during constexpr evaluation of it, while with folded trees it just won't see
it.

	Jakub
Nathan Sidwell Jan. 24, 2017, 12:01 p.m. UTC | #7
On 01/24/2017 05:49 AM, Jakub Jelinek wrote:

> ((BIT_FIELD_REF <a, 8, 0> ^ BIT_FIELD_REF <b, 8, 0>) & 110) == 0
> out of that.  So unless we DTRT (i.e. save constexpr bodies before
> cp_fold for constexpr evaluation purposes), the workaround would need
> to handle this properly (basically pattern recognize whatever the

for avoidance of doubt, I'm arguing that such folding is premature in 
the face of contexpr.

nathan
Jakub Jelinek Jan. 24, 2017, 12:25 p.m. UTC | #8
On Tue, Jan 24, 2017 at 07:01:45AM -0500, Nathan Sidwell wrote:
> On 01/24/2017 05:49 AM, Jakub Jelinek wrote:
> 
> > ((BIT_FIELD_REF <a, 8, 0> ^ BIT_FIELD_REF <b, 8, 0>) & 110) == 0
> > out of that.  So unless we DTRT (i.e. save constexpr bodies before
> > cp_fold for constexpr evaluation purposes), the workaround would need
> > to handle this properly (basically pattern recognize whatever the
> 
> for avoidance of doubt, I'm arguing that such folding is premature in the
> face of contexpr.

I'm arguing that pretty much all folding is premature in the face of
constexpr.  We e.g. accept:

constexpr int a[2] = { 1, 2 };
constexpr int foo (const int *x, int y) { return x[y] & 0; }
constexpr int b = foo (a, 1);
constexpr int c = foo (a, 2);		// { dg-error "" }
constexpr int d = foo (a, 3);		// { dg-error "" }

because we do constexpr evaluation on folded trees, while clang++ properly
rejects it for c and d.

	Jakub
Nathan Sidwell Jan. 24, 2017, 6:31 p.m. UTC | #9
On 01/23/2017 04:06 PM, Jason Merrill wrote:

>> In this particular case we've also made the base object the containing
>> class, not the unnamed struct member.  That means we're looking in the wrong
>> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true.  Whereas for the
>> subobj's CONSTRUCTOR it is false.
>
> Why is it false?

I thought it was because we're looking at a different level of ctor, 
investigation shows there may be a bug there too. because in one place 
we do:
      if (*valp == NULL_TREE)
	{
	  *valp = build_constructor (type, NULL);
	  CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) = no_zero_init;
and another we do:
       if (vec_safe_is_empty (*vec)
	  || (*vec)->last().index != field)
	{
	  ctor = build_constructor (TREE_TYPE (field), NULL);
	  CONSTRUCTOR_APPEND_ELT (*vec, field, ctor);

However, further looking at this problem, I discovered we're not 
properly checking the initialization of anonymous members.  Once we do 
that, we reject the ctor as a constexpr, because it fails to initialize 
the 'type' member (regardless of bitfieldness).

This patch augments cx_check_missing_mem_inits.  I change the first parm 
to be the CTYPE not the FUN from whence we pull the CTYPE.  That way we 
don't have to cons up an empty CONSTRUCTOR for the recursive case of 
discovering no initializer at all for the anon member.

With this in place we don't try and evaluate the constexpr in the 
original testcase.

ok?

nathan
Jason Merrill Jan. 24, 2017, 6:41 p.m. UTC | #10
On Tue, Jan 24, 2017 at 1:31 PM, Nathan Sidwell <nathan@acm.org> wrote:
> On 01/23/2017 04:06 PM, Jason Merrill wrote:
>
>>> In this particular case we've also made the base object the containing
>>> class, not the unnamed struct member.  That means we're looking in the
>>> wrong
>>> CONSTRUCTOR and see CONSTRUCTOR_NO_IMPLICIT_ZERO is true.  Whereas for
>>> the
>>> subobj's CONSTRUCTOR it is false.
>>
>>
>> Why is it false?
>
>
> I thought it was because we're looking at a different level of ctor,
> investigation shows there may be a bug there too. because in one place we
> do:
>      if (*valp == NULL_TREE)
>         {
>           *valp = build_constructor (type, NULL);
>           CONSTRUCTOR_NO_IMPLICIT_ZERO (*valp) = no_zero_init;
> and another we do:
>       if (vec_safe_is_empty (*vec)
>           || (*vec)->last().index != field)
>         {
>           ctor = build_constructor (TREE_TYPE (field), NULL);
>           CONSTRUCTOR_APPEND_ELT (*vec, field, ctor);
>
> However, further looking at this problem, I discovered we're not properly
> checking the initialization of anonymous members.  Once we do that, we
> reject the ctor as a constexpr, because it fails to initialize the 'type'
> member (regardless of bitfieldness).
>
> This patch augments cx_check_missing_mem_inits.  I change the first parm to
> be the CTYPE not the FUN from whence we pull the CTYPE.  That way we don't
> have to cons up an empty CONSTRUCTOR for the recursive case of discovering
> no initializer at all for the anon member.
>
> With this in place we don't try and evaluate the constexpr in the original
> testcase.
>
> ok?

I'm not seeing the patch.

Jason
diff mbox

Patch

2017-01-23  Nathan Sidwell  <nathan@acm.org>

	PR c++/79118 - bitfields and constexpr
	* constexpr.c (cxx_eval_bitfield): Return zero or set non-constant
	if no bitfield found.

	PR c++/79118
	* g++.dg/cpp0x/pr79118.C: New.

Index: cp/constexpr.c
===================================================================
--- cp/constexpr.c	(revision 244728)
+++ cp/constexpr.c	(working copy)
@@ -2447,6 +2447,11 @@  cxx_eval_bit_field_ref (const constexpr_
       tree bitpos = bit_position (field);
       if (bitpos == start && DECL_SIZE (field) == TREE_OPERAND (t, 1))
 	return value;
+
+      /* fold likes to change bitfield accesses into byte-friendly
+	 sizes and adding explicit masking. If this field is within
+	 the range of bits being extracted, accumulate into
+	 retval.  */
       if (TREE_CODE (TREE_TYPE (field)) == INTEGER_TYPE
 	  && TREE_CODE (value) == INTEGER_CST
 	  && tree_fits_shwi_p (bitpos)
@@ -2476,8 +2481,31 @@  cxx_eval_bit_field_ref (const constexpr_
     }
   if (fld_seen)
     return fold_convert (TREE_TYPE (t), retval);
-  gcc_unreachable ();
-  return error_mark_node;
+
+  if (CONSTRUCTOR_NO_IMPLICIT_ZERO (whole))
+    {
+      if (!ctx->quiet)
+	{
+	  /* Find the bitfield for a useful error message.  */
+	  for (field = TYPE_FIELDS (TREE_TYPE (TREE_OPERAND (t, 0)));
+	       field; field = TREE_CHAIN (field))
+	    if (TREE_CODE (field) == FIELD_DECL
+		&& DECL_SIZE (field) == TREE_OPERAND (t, 1)
+		&& bit_position (field) == start)
+	      break;
+	  if (field)
+	    error ("accessing uninitialized bitfield %qD", field);
+	  else
+	    error ("accessing uninitialized bitfield");
+	}
+      *non_constant_p = true;
+      return t;
+    }
+
+  /* If there's no explicit init for this field, it's value-initialized.  */
+  value = build_value_init (TREE_TYPE (t), tf_warning_or_error);
+  return cxx_eval_constant_expression (ctx, value, lval,
+				       non_constant_p, overflow_p);
 }
 
 /* Subroutine of cxx_eval_constant_expression.
Index: testsuite/g++.dg/cpp0x/pr79118.C
===================================================================
--- testsuite/g++.dg/cpp0x/pr79118.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/pr79118.C	(working copy)
@@ -0,0 +1,19 @@ 
+// { dg-do compile { target c++11 } }
+// { dg-options "-std=c++11" }
+// PR c++/79118 ICE with uninitialized bit field
+
+struct Impl {
+  struct {
+    char raw;
+    char type : 2;
+  };
+
+  constexpr Impl () : raw () {}
+
+  constexpr bool get () { return type; }
+};
+
+void Foo ()
+{
+  !Impl().get();
+}