diff mbox series

c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

Message ID ZtX6mn9TPzL/KRU8@tucnak
State New
Headers show
Series c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449] | expand

Commit Message

Jakub Jelinek Sept. 2, 2024, 5:49 p.m. UTC
Hi!

The following testcase is miscompiled, because
get_member_function_from_ptrfunc
emits something like
(((FUNCTION.__pfn & 1) != 0)
 ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
 : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
or so, so FUNCTION tree is used there 5 times.  There is
if (TREE_SIDE_EFFECTS (function)) function = save_expr (function);
but in this case function doesn't have side-effects, just nested ARRAY_REFs.
Now, if all the FUNCTION trees would be shared, it would work fine,
FUNCTION is evaluated in the first operand of COND_EXPR; but unfortunately
that isn't the case, both the BIT_AND_EXPR shortening and conversion to
bool done for build_conditional_expr actually unshare_expr that first
expression, but none of the other 4 are unshared.  With -fsanitize=bounds,
.UBSAN_BOUNDS calls are added to the ARRAY_REFs and use SAVE_EXPRs to avoid
evaluating the argument multiple times, but because that FUNCTION tree is
first used in the second argument of COND_EXPR (i.e. conditionally), the
SAVE_EXPR initialization is done just there and then the third argument
of COND_EXPR just uses the uninitialized temporary and so does the first
argument computation as well.

The following patch fixes it by also unsharing the trees that end up
in the third COND_EXPR argument and unsharing what is passed as the first
argument too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Guess another possibility would be to somehow arrange for the BIT_AND_EXPR
and NE_EXPR not to do unshare_expr (but that would mean build2 most likely
rather than cp_build_binary_op), or force FUNCTION into a SAVE_EXPR whenever
it isn't say a tcc_declaration or tcc_constant even if it doesn't have
side-effects (but that would mean creating the SAVE_EXPR by hand) or create
a TARGET_EXPR instead (force_target_expr).

2024-09-02  Jakub Jelinek  <jakub@redhat.com>

	PR c++/116449
	* typeck.cc (get_member_function_from_ptrfunc): Call unshare_expr
	on *instance_ptrptr and e3.

	* g++.dg/ubsan/pr116449.C: New test.


	Jakub

Comments

Jason Merrill Sept. 4, 2024, 3:06 p.m. UTC | #1
On 9/2/24 1:49 PM, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase is miscompiled, because
> get_member_function_from_ptrfunc
> emits something like
> (((FUNCTION.__pfn & 1) != 0)
>   ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
>   : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
> or so, so FUNCTION tree is used there 5 times.  There is
> if (TREE_SIDE_EFFECTS (function)) function = save_expr (function);
> but in this case function doesn't have side-effects, just nested ARRAY_REFs.
> Now, if all the FUNCTION trees would be shared, it would work fine,
> FUNCTION is evaluated in the first operand of COND_EXPR; but unfortunately
> that isn't the case, both the BIT_AND_EXPR shortening and conversion to
> bool done for build_conditional_expr actually unshare_expr that first
> expression, but none of the other 4 are unshared.  With -fsanitize=bounds,
> .UBSAN_BOUNDS calls are added to the ARRAY_REFs and use SAVE_EXPRs to avoid
> evaluating the argument multiple times, but because that FUNCTION tree is
> first used in the second argument of COND_EXPR (i.e. conditionally), the
> SAVE_EXPR initialization is done just there and then the third argument
> of COND_EXPR just uses the uninitialized temporary and so does the first
> argument computation as well.

If there are SAVE_EXPRs in FUNCTION, why is TREE_SIDE_EFFECTS false?

> The following patch fixes it by also unsharing the trees that end up
> in the third COND_EXPR argument and unsharing what is passed as the first
> argument too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Guess another possibility would be to somehow arrange for the BIT_AND_EXPR
> and NE_EXPR not to do unshare_expr (but that would mean build2 most likely
> rather than cp_build_binary_op), or force FUNCTION into a SAVE_EXPR whenever
> it isn't say a tcc_declaration or tcc_constant even if it doesn't have
> side-effects (but that would mean creating the SAVE_EXPR by hand) or create
> a TARGET_EXPR instead (force_target_expr).

It seems desirable to only do the bounds-checking once.

> 2024-09-02  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/116449
> 	* typeck.cc (get_member_function_from_ptrfunc): Call unshare_expr
> 	on *instance_ptrptr and e3.
> 
> 	* g++.dg/ubsan/pr116449.C: New test.
> 
> --- gcc/cp/typeck.cc.jj	2024-07-26 08:34:18.117159928 +0200
> +++ gcc/cp/typeck.cc	2024-09-02 12:35:54.254380135 +0200
> @@ -4255,8 +4255,11 @@ get_member_function_from_ptrfunc (tree *
>         /* ...and then the delta in the PMF.  */
>         instance_ptr = fold_build_pointer_plus (instance_ptr, delta);
>   
> -      /* Hand back the adjusted 'this' argument to our caller.  */
> -      *instance_ptrptr = instance_ptr;
> +      /* Hand back the adjusted 'this' argument to our caller.
> +	 The e1 computation unfortunately can result in unshare_expr
> +	 and we need to make sure the delta tree isn't first evaluated
> +	 in the COND_EXPR branch.  */
> +      *instance_ptrptr = unshare_expr (instance_ptr);
>   
>         if (nonvirtual)
>   	/* Now just return the pointer.  */
> @@ -4283,6 +4286,9 @@ get_member_function_from_ptrfunc (tree *
>   		     cp_build_addr_expr (e2, complain));
>   
>         e2 = fold_convert (TREE_TYPE (e3), e2);
> +      /* As e1 computation can result in unshare_expr, make sure e3 isn't
> +	 shared with the e2 trees.  */
> +      e3 = unshare_expr (e3);
>         e1 = build_conditional_expr (input_location, e1, e2, e3, complain);
>         if (e1 == error_mark_node)
>   	return error_mark_node;
> --- gcc/testsuite/g++.dg/ubsan/pr116449.C.jj	2024-09-02 12:34:18.545629851 +0200
> +++ gcc/testsuite/g++.dg/ubsan/pr116449.C	2024-09-02 12:31:49.070581617 +0200
> @@ -0,0 +1,14 @@
> +// PR c++/116449
> +// { dg-do compile }
> +// { dg-options "-O2 -Wall -fsanitize=undefined" }
> +
> +struct C { void foo (int); void bar (); int c[16]; };
> +typedef void (C::*P) ();
> +struct D { P d; };
> +static D e[1] = { { &C::bar } };
> +
> +void
> +C::foo (int x)
> +{
> +  (this->*e[c[x]].d) ();
> +}
> 
> 	Jakub
>
Jakub Jelinek Sept. 4, 2024, 3:15 p.m. UTC | #2
On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote:
> On 9/2/24 1:49 PM, Jakub Jelinek wrote:
> > Hi!
> > 
> > The following testcase is miscompiled, because
> > get_member_function_from_ptrfunc
> > emits something like
> > (((FUNCTION.__pfn & 1) != 0)
> >   ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
> >   : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
> > or so, so FUNCTION tree is used there 5 times.  There is
> > if (TREE_SIDE_EFFECTS (function)) function = save_expr (function);
> > but in this case function doesn't have side-effects, just nested ARRAY_REFs.
> > Now, if all the FUNCTION trees would be shared, it would work fine,
> > FUNCTION is evaluated in the first operand of COND_EXPR; but unfortunately
> > that isn't the case, both the BIT_AND_EXPR shortening and conversion to
> > bool done for build_conditional_expr actually unshare_expr that first
> > expression, but none of the other 4 are unshared.  With -fsanitize=bounds,
> > .UBSAN_BOUNDS calls are added to the ARRAY_REFs and use SAVE_EXPRs to avoid
> > evaluating the argument multiple times, but because that FUNCTION tree is
> > first used in the second argument of COND_EXPR (i.e. conditionally), the
> > SAVE_EXPR initialization is done just there and then the third argument
> > of COND_EXPR just uses the uninitialized temporary and so does the first
> > argument computation as well.
> 
> If there are SAVE_EXPRs in FUNCTION, why is TREE_SIDE_EFFECTS false?

They aren't there when this function is called, they are added only during
cp_genericize when instrumenting the ARRAY_REFs with UBSAN.

And unlike this function, e.g. ubsan_instrument_bounds just calls save_expr
on the index and not if (TREE_SIDE_EFFECTS (index)) index = save_expr
(index).

> It seems desirable to only do the bounds-checking once.

So, one possibility would be to call save_expr unconditionally in
get_member_function_from_ptrfunc as well.

Or build a TARGET_EXPR (force_target_expr or similar).

	Jakub
Jason Merrill Sept. 4, 2024, 4:34 p.m. UTC | #3
On 9/4/24 11:15 AM, Jakub Jelinek wrote:
> On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote:
>> On 9/2/24 1:49 PM, Jakub Jelinek wrote:
>>> Hi!
>>>
>>> The following testcase is miscompiled, because
>>> get_member_function_from_ptrfunc
>>> emits something like
>>> (((FUNCTION.__pfn & 1) != 0)
>>>    ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
>>>    : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
>>> or so, so FUNCTION tree is used there 5 times.  There is
>>> if (TREE_SIDE_EFFECTS (function)) function = save_expr (function);
>>> but in this case function doesn't have side-effects, just nested ARRAY_REFs.
>>> Now, if all the FUNCTION trees would be shared, it would work fine,
>>> FUNCTION is evaluated in the first operand of COND_EXPR; but unfortunately
>>> that isn't the case, both the BIT_AND_EXPR shortening and conversion to
>>> bool done for build_conditional_expr actually unshare_expr that first
>>> expression, but none of the other 4 are unshared.  With -fsanitize=bounds,
>>> .UBSAN_BOUNDS calls are added to the ARRAY_REFs and use SAVE_EXPRs to avoid
>>> evaluating the argument multiple times, but because that FUNCTION tree is
>>> first used in the second argument of COND_EXPR (i.e. conditionally), the
>>> SAVE_EXPR initialization is done just there and then the third argument
>>> of COND_EXPR just uses the uninitialized temporary and so does the first
>>> argument computation as well.
>>
>> If there are SAVE_EXPRs in FUNCTION, why is TREE_SIDE_EFFECTS false?
> 
> They aren't there when this function is called, they are added only during
> cp_genericize when instrumenting the ARRAY_REFs with UBSAN.
> 
> And unlike this function, e.g. ubsan_instrument_bounds just calls save_expr
> on the index and not if (TREE_SIDE_EFFECTS (index)) index = save_expr
> (index).
> 
>> It seems desirable to only do the bounds-checking once.
> 
> So, one possibility would be to call save_expr unconditionally in
> get_member_function_from_ptrfunc as well.
> 
> Or build a TARGET_EXPR (force_target_expr or similar).

Yes.  I don't have a strong preference between the two.

Jason
diff mbox series

Patch

--- gcc/cp/typeck.cc.jj	2024-07-26 08:34:18.117159928 +0200
+++ gcc/cp/typeck.cc	2024-09-02 12:35:54.254380135 +0200
@@ -4255,8 +4255,11 @@  get_member_function_from_ptrfunc (tree *
       /* ...and then the delta in the PMF.  */
       instance_ptr = fold_build_pointer_plus (instance_ptr, delta);
 
-      /* Hand back the adjusted 'this' argument to our caller.  */
-      *instance_ptrptr = instance_ptr;
+      /* Hand back the adjusted 'this' argument to our caller.
+	 The e1 computation unfortunately can result in unshare_expr
+	 and we need to make sure the delta tree isn't first evaluated
+	 in the COND_EXPR branch.  */
+      *instance_ptrptr = unshare_expr (instance_ptr);
 
       if (nonvirtual)
 	/* Now just return the pointer.  */
@@ -4283,6 +4286,9 @@  get_member_function_from_ptrfunc (tree *
 		     cp_build_addr_expr (e2, complain));
 
       e2 = fold_convert (TREE_TYPE (e3), e2);
+      /* As e1 computation can result in unshare_expr, make sure e3 isn't
+	 shared with the e2 trees.  */
+      e3 = unshare_expr (e3);
       e1 = build_conditional_expr (input_location, e1, e2, e3, complain);
       if (e1 == error_mark_node)
 	return error_mark_node;
--- gcc/testsuite/g++.dg/ubsan/pr116449.C.jj	2024-09-02 12:34:18.545629851 +0200
+++ gcc/testsuite/g++.dg/ubsan/pr116449.C	2024-09-02 12:31:49.070581617 +0200
@@ -0,0 +1,14 @@ 
+// PR c++/116449
+// { dg-do compile }
+// { dg-options "-O2 -Wall -fsanitize=undefined" }
+
+struct C { void foo (int); void bar (); int c[16]; };
+typedef void (C::*P) ();
+struct D { P d; };
+static D e[1] = { { &C::bar } };
+
+void
+C::foo (int x)
+{
+  (this->*e[c[x]].d) ();
+}