Message ID | 20220617212002.3747825-1-jason@redhat.com |
---|---|
State | New |
Headers | show |
Series | [RFA] ubsan: do return check with -fsanitize=unreachable | expand |
On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote: > Related to PR104642, the current situation where we get less return checking > with just -fsanitize=unreachable than no sanitize flags seems undesirable; I > propose that we do return checking when -fsanitize=unreachable. __builtin_unreachable itself (unless turned into trap or __ubsan_handle_builtin_unreachable) is not any kind of return checking, it is just an optimization. > Looks like clang just traps on missing return if not -fsanitize=return, but > the approach in this patch seems more helpful to me if we're already > sanitizing other should-be-unreachable code. > > I'm assuming that the difference in treatment of SANITIZE_UNREACHABLE and > SANITIZE_RETURN with regard to loop optimization is deliberate. return and unreachable are separate sanitizers and such silent one way implication can have quite unexpected consequences, especially with -fsanitize-trap=. Say with -fsanitize=unreachable -fsanitize-trap=unreachable, both current trunk and clang will link without -lubsan, because the only enabled UBSan sanitizers use __builtin_trap () which doesn't need library. With -fsanitize=unreachable silently meaning -fsanitize=unreachable,return the above would link in -lubsan, because while SANITIZE_UNREACHABLE uses __builtin_trap, SANITIZE_RETURN doesn't. Similarly, one has no_sanitize attribute, one could in certain function __attribute__((no_sanitize ("unreachable"))) and because on the command line using -fsanitize=unreachable assume other sanitizers aren't enabled, but the silent addition of return sanitizer would break that. > --- a/gcc/cp/cp-gimplify.cc > +++ b/gcc/cp/cp-gimplify.cc > @@ -1806,18 +1806,6 @@ cp_maybe_instrument_return (tree fndecl) > || !targetm.warn_func_return (fndecl)) > return; > > - if (!sanitize_flags_p (SANITIZE_RETURN, fndecl) > - /* Don't add __builtin_unreachable () if not optimizing, it will not > - improve any optimizations in that case, just break UB code. > - Don't add it if -fsanitize=unreachable -fno-sanitize=return either, > - UBSan covers this with ubsan_instrument_return above where sufficient > - information is provided, while the __builtin_unreachable () below > - if return sanitization is disabled will just result in hard to > - understand runtime error without location. */ > - && (!optimize > - || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl))) > - return; > - > tree t = DECL_SAVED_TREE (fndecl); > while (t) > { I think the above is correct, if -fsanitize=return, we want to fall through and use __ubsan_handle_missing_return (or __builtin_trap if -fsanitize-trap=return). Otherwise, for -O0, __builtin_unreachable most likely doesn't offer any important optimization benefits and just makes debugging bad code harder. Similarly for -fsanitize=unreachable, the __builtin_unreachable there would be an optimization which we shouldn't turn into __ubsan_handle_builtin_unreachable / __builtin_trap. Now, -funreachable-traps can of course change the condition a little bit, and so can implementation of builtin_decl_unreachable and stopping of folding of __builtin_unreachable to __builtin_trap if -fsanitize=unreachable -fsanitize-trap=unreachable. The -fsanitize=return case remains the same no matter what. Otherwise, if -funreachable-traps, we are emitting __builtin_trap rather than __builtin_unreachable, so it is perfectly fine to fall through regardless of !optimize or SANITIZE_UNREACHABLE being on, it isn't an optimization in that case, but checking. Otherwise, if !optimize, we should return, __builtin_unreachable in there wouldn't bring many advantages and just punish users of bad code. Otherwise, if builtin_decl_unreachable is implemented and we never fold __builtin_unreachable to __builtin_trap, for SANITIZE_UNREACHABLE enabled and (flag_sanitize_trap & SANITIZE_UNREACHABLE) != 0 we could emit __builtin_unreachable (but in that case directly, not through builtin_decl_unreachable). Otherwise, if SANITIZE_UNREACHABLE is on and (flag_sanitize_trap & SANITIZE_UNREACHABLE) == 0, I assume we'll still want to fold __builtin_unreachable to __ubsan_handle_builtin_unreachable during sanopt etc., we can live without the optimization and not instrument. Otherwise emit __builtin_unreachable (directly). Jakub
On 6/20/22 07:05, Jakub Jelinek wrote: > On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote: >> Related to PR104642, the current situation where we get less return checking >> with just -fsanitize=unreachable than no sanitize flags seems undesirable; I >> propose that we do return checking when -fsanitize=unreachable. > > __builtin_unreachable itself (unless turned into trap or > __ubsan_handle_builtin_unreachable) is not any kind of return checking, it > is just an optimization. Yes, but I'm talking about "when -fsanitize=unreachable". >> Looks like clang just traps on missing return if not -fsanitize=return, but >> the approach in this patch seems more helpful to me if we're already >> sanitizing other should-be-unreachable code. >> >> I'm assuming that the difference in treatment of SANITIZE_UNREACHABLE and >> SANITIZE_RETURN with regard to loop optimization is deliberate. > > return and unreachable are separate sanitizers and such silent one way > implication can have quite unexpected consequences, especially with > -fsanitize-trap=. > Say with -fsanitize=unreachable -fsanitize-trap=unreachable, both current > trunk and clang will link without -lubsan, because the only enabled UBSan > sanitizers use __builtin_trap () which doesn't need library. > With -fsanitize=unreachable silently meaning -fsanitize=unreachable,return > the above would link in -lubsan, because while SANITIZE_UNREACHABLE uses > __builtin_trap, SANITIZE_RETURN doesn't. > Similarly, one has no_sanitize attribute, one could in certain function > __attribute__((no_sanitize ("unreachable"))) and because on the command > line using -fsanitize=unreachable assume other sanitizers aren't enabled, > but the silent addition of return sanitizer would break that. Ah, true. How about this approach instead?
On 6/20/22 16:16, Jason Merrill wrote: > On 6/20/22 07:05, Jakub Jelinek wrote: >> On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote: >>> Related to PR104642, the current situation where we get less return >>> checking >>> with just -fsanitize=unreachable than no sanitize flags seems >>> undesirable; I >>> propose that we do return checking when -fsanitize=unreachable. >> >> __builtin_unreachable itself (unless turned into trap or >> __ubsan_handle_builtin_unreachable) is not any kind of return >> checking, it >> is just an optimization. > > Yes, but I'm talking about "when -fsanitize=unreachable". > >>> Looks like clang just traps on missing return if not >>> -fsanitize=return, but >>> the approach in this patch seems more helpful to me if we're already >>> sanitizing other should-be-unreachable code. >>> >>> I'm assuming that the difference in treatment of SANITIZE_UNREACHABLE >>> and >>> SANITIZE_RETURN with regard to loop optimization is deliberate. >> >> return and unreachable are separate sanitizers and such silent one way >> implication can have quite unexpected consequences, especially with >> -fsanitize-trap=. >> Say with -fsanitize=unreachable -fsanitize-trap=unreachable, both current >> trunk and clang will link without -lubsan, because the only enabled UBSan >> sanitizers use __builtin_trap () which doesn't need library. >> With -fsanitize=unreachable silently meaning >> -fsanitize=unreachable,return >> the above would link in -lubsan, because while SANITIZE_UNREACHABLE uses >> __builtin_trap, SANITIZE_RETURN doesn't. >> Similarly, one has no_sanitize attribute, one could in certain function >> __attribute__((no_sanitize ("unreachable"))) and because on the command >> line using -fsanitize=unreachable assume other sanitizers aren't enabled, >> but the silent addition of return sanitizer would break that. > > Ah, true. How about this approach instead? Or, this approach relies on the PR104642 patch, and just fixes the line number issue. This is less clear about the problem than using the return ubsan library function, but avoids using one entry point to implement the other sanitizer, if that's important.
On 6/22/22 00:04, Jason Merrill wrote: > On 6/20/22 16:16, Jason Merrill wrote: >> On 6/20/22 07:05, Jakub Jelinek wrote: >>> On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote: >>>> Related to PR104642, the current situation where we get less return >>>> checking >>>> with just -fsanitize=unreachable than no sanitize flags seems >>>> undesirable; I >>>> propose that we do return checking when -fsanitize=unreachable. >>> >>> __builtin_unreachable itself (unless turned into trap or >>> __ubsan_handle_builtin_unreachable) is not any kind of return >>> checking, it >>> is just an optimization. >> >> Yes, but I'm talking about "when -fsanitize=unreachable". >> >>>> Looks like clang just traps on missing return if not >>>> -fsanitize=return, but >>>> the approach in this patch seems more helpful to me if we're already >>>> sanitizing other should-be-unreachable code. >>>> >>>> I'm assuming that the difference in treatment of >>>> SANITIZE_UNREACHABLE and >>>> SANITIZE_RETURN with regard to loop optimization is deliberate. >>> >>> return and unreachable are separate sanitizers and such silent one way >>> implication can have quite unexpected consequences, especially with >>> -fsanitize-trap=. >>> Say with -fsanitize=unreachable -fsanitize-trap=unreachable, both >>> current >>> trunk and clang will link without -lubsan, because the only enabled >>> UBSan >>> sanitizers use __builtin_trap () which doesn't need library. >>> With -fsanitize=unreachable silently meaning >>> -fsanitize=unreachable,return >>> the above would link in -lubsan, because while SANITIZE_UNREACHABLE uses >>> __builtin_trap, SANITIZE_RETURN doesn't. >>> Similarly, one has no_sanitize attribute, one could in certain function >>> __attribute__((no_sanitize ("unreachable"))) and because on the command >>> line using -fsanitize=unreachable assume other sanitizers aren't >>> enabled, >>> but the silent addition of return sanitizer would break that. >> >> Ah, true. How about this approach instead? > > Or, this approach relies on the PR104642 patch, and just fixes the line > number issue. This is less clear about the problem than using the > return ubsan library function, but avoids using one entry point to > implement the other sanitizer, if that's important. Ping?
On Wed, Jun 22, 2022 at 12:04:59AM -0400, Jason Merrill wrote: > On 6/20/22 16:16, Jason Merrill wrote: > > On 6/20/22 07:05, Jakub Jelinek wrote: > > > On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote: > > > > Related to PR104642, the current situation where we get less > > > > return checking > > > > with just -fsanitize=unreachable than no sanitize flags seems > > > > undesirable; I > > > > propose that we do return checking when -fsanitize=unreachable. > > > > > > __builtin_unreachable itself (unless turned into trap or > > > __ubsan_handle_builtin_unreachable) is not any kind of return > > > checking, it > > > is just an optimization. > > > > Yes, but I'm talking about "when -fsanitize=unreachable". The usual case is that people just use -fsanitize=undefined and get both return and unreachable sanitization, for fall through into end of functions returning non-void done through return sanitization. In the rare case people use something different like -fsanitize=undefined -fno-sanitize=return or -fsanitize=unreachable etc., they presumably don't want the fall through from end of function diagnosed at runtime. I think the behavior we want is: 1) -fsanitize=return is on -> use ubsan_instrument_return (__ubsan_missing_return_data or __builtin_trap depending on -fsanitize-trap=return); otherwise 2) -funreachable-traps is on (from -O0/-Og by default or explicit), emit __builtin_trap; otherwise 3) -fsanitize=unreachable is on, not emit anything (__builtin_unreachable would be just an optimization, but user asked not to instrument returns, only unreachable, so honor user's decision and avoid confusion); otherwise 4) -O0 is on, not emit anything (__builtin_unreachable wouldn't be much of an optimization, just surprising and hard to debug effect); otherwise 5) emit __builtin_unreachable Current trunk with your PR104642 fix in implements 1), will do 2) only if -fsanitize=unreachable is not on, will do 3), will do 4) and 5). So, I'd change cp-gimplify.cc (cp_maybe_instrument_return), change: if (!sanitize_flags_p (SANITIZE_RETURN, fndecl) && ((!optimize && !flag_unreachable_traps) || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl))) return; to if (!sanitize_flags_p (SANITIZE_RETURN, fndecl) && !flag_unreachable_traps && (!optimize || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl))) return; and if (sanitize_flags_p (SANITIZE_RETURN, fndecl)) t = ubsan_instrument_return (loc); else t = build_builtin_unreachable (BUILTINS_LOCATION); to if (sanitize_flags_p (SANITIZE_RETURN, fndecl)) t = ubsan_instrument_return (loc); else if (flag_unreachable_traps) t = build_call_expr_loc (BUILTINS_LOCATION, builtin_decl_explicit (BUILT_IN_TRAP), 0); else t = build_builtin_unreachable (BUILTINS_LOCATION); Jakub
On 6/27/22 11:44, Jakub Jelinek wrote: > On Wed, Jun 22, 2022 at 12:04:59AM -0400, Jason Merrill wrote: >> On 6/20/22 16:16, Jason Merrill wrote: >>> On 6/20/22 07:05, Jakub Jelinek wrote: >>>> On Fri, Jun 17, 2022 at 05:20:02PM -0400, Jason Merrill wrote: >>>>> Related to PR104642, the current situation where we get less >>>>> return checking >>>>> with just -fsanitize=unreachable than no sanitize flags seems >>>>> undesirable; I >>>>> propose that we do return checking when -fsanitize=unreachable. >>>> >>>> __builtin_unreachable itself (unless turned into trap or >>>> __ubsan_handle_builtin_unreachable) is not any kind of return >>>> checking, it >>>> is just an optimization. >>> >>> Yes, but I'm talking about "when -fsanitize=unreachable". > > The usual case is that people just use -fsanitize=undefined > and get both return and unreachable sanitization, for fall through > into end of functions returning non-void done through return sanitization. > > In the rare case people use something different like > -fsanitize=undefined -fno-sanitize=return > or > -fsanitize=unreachable > etc., they presumably don't want the fall through from end of function > diagnosed at runtime. I disagree with this assumption for the second case; it seems much more likely to me that the user just wasn't thinking about needing to also mention return. Missing return is a logical subset of unreachable; if we sanitize all the other __builtin_unreachable introduced by the compiler, why in the world would we want to leave out this one that is such a frequent error? Full -fsanitize=undefined is much higher overhead than just -fsanitize=unreachable, which introduces no extra checks. And adding return checking to unreachable is essentially zero overhead. I can't imagine any reason to want to check unreachable points EXCEPT for missing return. > I think the behavior we want is: > 1) -fsanitize=return is on -> use ubsan_instrument_return > (__ubsan_missing_return_data or __builtin_trap depending on > -fsanitize-trap=return); otherwise > 2) -funreachable-traps is on (from -O0/-Og by default or explicit), > emit __builtin_trap; otherwise > 3) -fsanitize=unreachable is on, not emit anything (__builtin_unreachable > would be just an optimization, but user asked not to instrument returns, > only unreachable, so honor user's decision and avoid confusion); otherwise > 4) -O0 is on, not emit anything (__builtin_unreachable wouldn't be much > of an optimization, just surprising and hard to debug effect); otherwise > 5) emit __builtin_unreachable > > Current trunk with your PR104642 fix in implements 1), will do 2) > only if -fsanitize=unreachable is not on, will do 3), will do 4) and 5). > > So, I'd change cp-gimplify.cc (cp_maybe_instrument_return), change: > if (!sanitize_flags_p (SANITIZE_RETURN, fndecl) > && ((!optimize && !flag_unreachable_traps) > || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl))) > return; > to > if (!sanitize_flags_p (SANITIZE_RETURN, fndecl) > && !flag_unreachable_traps > && (!optimize || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl))) > return; > and > if (sanitize_flags_p (SANITIZE_RETURN, fndecl)) > t = ubsan_instrument_return (loc); > else > t = build_builtin_unreachable (BUILTINS_LOCATION); > to > if (sanitize_flags_p (SANITIZE_RETURN, fndecl)) > t = ubsan_instrument_return (loc); > else if (flag_unreachable_traps) > t = build_call_expr_loc (BUILTINS_LOCATION, > builtin_decl_explicit (BUILT_IN_TRAP), 0); > else > t = build_builtin_unreachable (BUILTINS_LOCATION); > > Jakub >
On Wed, Jun 29, 2022 at 12:42:04PM -0400, Jason Merrill wrote: > > The usual case is that people just use -fsanitize=undefined > > and get both return and unreachable sanitization, for fall through > > into end of functions returning non-void done through return sanitization. > > > > In the rare case people use something different like > > -fsanitize=undefined -fno-sanitize=return > > or > > -fsanitize=unreachable > > etc., they presumably don't want the fall through from end of function > > diagnosed at runtime. > > I disagree with this assumption for the second case; it seems much more > likely to me that the user just wasn't thinking about needing to also > mention return. Missing return is a logical subset of unreachable; if we > sanitize all the other __builtin_unreachable introduced by the compiler, why > in the world would we want to leave out this one that is such a frequent > error? UBSan was initially implemented in LLVM and our -fsanitize= options try to match where possible what they do. And their behavior is too that return and unreachable are separate sanitizers, fallthrough from function return is sanitized only for the former, they apparently at -O0 implement something like -funreachable-traps (but not at -Og) and emit a trap there (regardless of -fsanitize=unreachable), for -O1 and higher they act as if non-sanitized __builtin_unreachable () is in there regardless of -fsanitize=unreachable. It would be strange to diverge from this without a strong reason. The fact that we use __builtin_unreachable for the function ends is just our implementation detail and when we'd report that to users, they'd just be confused on what's going on. With -fsanitize=return they are told what happens. > Full -fsanitize=undefined is much higher overhead than just > -fsanitize=unreachable, which introduces no extra checks. And adding return > checking to unreachable is essentially zero overhead. I can't imagine any > reason to want to check unreachable points EXCEPT for missing return. -fsanitize=unreachable isn't zero overhead, it will force evaluation of all the conditionals guarding it and preparation of arguments for it etc. Jakub
On 6/29/22 13:26, Jakub Jelinek wrote: > On Wed, Jun 29, 2022 at 12:42:04PM -0400, Jason Merrill wrote: >>> The usual case is that people just use -fsanitize=undefined >>> and get both return and unreachable sanitization, for fall through >>> into end of functions returning non-void done through return sanitization. >>> >>> In the rare case people use something different like >>> -fsanitize=undefined -fno-sanitize=return >>> or >>> -fsanitize=unreachable >>> etc., they presumably don't want the fall through from end of function >>> diagnosed at runtime. >> >> I disagree with this assumption for the second case; it seems much more >> likely to me that the user just wasn't thinking about needing to also >> mention return. Missing return is a logical subset of unreachable; if we >> sanitize all the other __builtin_unreachable introduced by the compiler, why >> in the world would we want to leave out this one that is such a frequent >> error? > > UBSan was initially implemented in LLVM and our -fsanitize= options try to > match where possible what they do. > And their behavior is too that return and unreachable are separate > sanitizers, fallthrough from function return is sanitized only for the > former, they apparently at -O0 implement something like -funreachable-traps > (but not at -Og) and emit a trap there (regardless of > -fsanitize=unreachable), for -O1 and higher they act as if non-sanitized > __builtin_unreachable () is in there regardless of -fsanitize=unreachable. Hmm, does clang only sanitize explicit calls to __builtin_unreachable? > It would be strange to diverge from this without a strong reason. > The fact that we use __builtin_unreachable for the function ends is just our > implementation detail and when we'd report that to users, they'd just be > confused on what's going on. With -fsanitize=return they are told what > happens. > >> Full -fsanitize=undefined is much higher overhead than just >> -fsanitize=unreachable, which introduces no extra checks. And adding return >> checking to unreachable is essentially zero overhead. I can't imagine any >> reason to want to check unreachable points EXCEPT for missing return. > > -fsanitize=unreachable isn't zero overhead, it will force evaluation of all > the conditionals guarding it and preparation of arguments for it etc.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 50f57877477..e572158a1ba 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -15946,6 +15946,8 @@ built with this option turned on will issue an error message when the end of a non-void function is reached without actually returning a value. This option works in C++ only. +This check is also enabled by -fsanitize=unreachable. + @item -fsanitize=signed-integer-overflow @opindex fsanitize=signed-integer-overflow This option enables signed integer overflow checking. We check that diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc index 6f84d157c98..5c2eb61842c 100644 --- a/gcc/cp/cp-gimplify.cc +++ b/gcc/cp/cp-gimplify.cc @@ -1806,18 +1806,6 @@ cp_maybe_instrument_return (tree fndecl) || !targetm.warn_func_return (fndecl)) return; - if (!sanitize_flags_p (SANITIZE_RETURN, fndecl) - /* Don't add __builtin_unreachable () if not optimizing, it will not - improve any optimizations in that case, just break UB code. - Don't add it if -fsanitize=unreachable -fno-sanitize=return either, - UBSan covers this with ubsan_instrument_return above where sufficient - information is provided, while the __builtin_unreachable () below - if return sanitization is disabled will just result in hard to - understand runtime error without location. */ - && (!optimize - || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl))) - return; - tree t = DECL_SAVED_TREE (fndecl); while (t) { diff --git a/gcc/opts.cc b/gcc/opts.cc index 062782ac700..a7b02b0f693 100644 --- a/gcc/opts.cc +++ b/gcc/opts.cc @@ -1254,6 +1254,16 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set, if (opts->x_flag_sanitize & ~(SANITIZE_LEAK | SANITIZE_UNREACHABLE)) opts->x_flag_aggressive_loop_optimizations = 0; + /* -fsanitize=unreachable implies -fsanitize=return, but without affecting + aggressive loop optimizations. */ + if ((opts->x_flag_sanitize & (SANITIZE_UNREACHABLE | SANITIZE_RETURN)) + == SANITIZE_UNREACHABLE) + { + opts->x_flag_sanitize |= SANITIZE_RETURN; + if (opts->x_flag_sanitize_trap & SANITIZE_UNREACHABLE) + opts->x_flag_sanitize_trap |= SANITIZE_RETURN; + } + /* Enable -fsanitize-address-use-after-scope if either address sanitizer is enabled. */ if (opts->x_flag_sanitize diff --git a/gcc/testsuite/g++.dg/ubsan/return-8c.C b/gcc/testsuite/g++.dg/ubsan/return-8c.C new file mode 100644 index 00000000000..a67f086d452 --- /dev/null +++ b/gcc/testsuite/g++.dg/ubsan/return-8c.C @@ -0,0 +1,15 @@ +// PR c++/104642 + +// -fsanitize=unreachable should imply -fsanitize=return. + +// { dg-do run } +// { dg-shouldfail { *-*-* } } +// { dg-additional-options "-O -fsanitize=unreachable" } + +bool b; + +int f() { + if (b) return 42; +} // { dg-warning "-Wreturn-type" } + +int main() { f(); }