Message ID | 876111kmo5.fsf@e105548-lin.cambridge.arm.com |
---|---|
State | New |
Headers | show |
On Tue, Nov 17, 2015 at 10:19 AM, Richard Sandiford <richard.sandiford@arm.com> wrote: > Richard Biener <richard.guenther@gmail.com> writes: >> On Fri, Nov 13, 2015 at 2:12 PM, Richard Sandiford >> <richard.sandiford@arm.com> wrote: >>> Richard Biener <richard.guenther@gmail.com> writes: >>>> On Mon, Nov 9, 2015 at 10:03 PM, Michael Matz <matz@suse.de> wrote: >>>>> Hi, >>>>> >>>>> On Mon, 9 Nov 2015, Richard Sandiford wrote: >>>>> >>>>>> +static bool >>>>>> +can_use_internal_fn (gcall *call) >>>>>> +{ >>>>>> + /* Only replace calls that set errno. */ >>>>>> + if (!gimple_vdef (call)) >>>>>> + return false; >>>>> >>>>> Oh, I managed to confuse this in my head while reading the patch. So, >>>>> hmm, you don't actually replace the builtin with an internal function >>>>> (without the condition) under no-errno-math? Does something else do that? >>>>> Because otherwise that seems an unnecessary restriction? >>>>> >>>>>> >> r229916 fixed that for the non-EH case. >>>>>> > >>>>>> > Ah, missed it. Even the EH case shouldn't be difficult. If the >>>>>> > original dominator of the EH destination was the call block it moves, >>>>>> > otherwise it remains unchanged. >>>>>> >>>>>> The target of the edge is easy in itself, I agree, but that isn't >>>>>> necessarily the only affected block, if the EH handler doesn't >>>>>> exit or rethrow. >>>>> >>>>> You're worried the non-EH and the EH regions merge again, right? Like so: >>>>> >>>>> before change: >>>>> >>>>> BB1: throwing-call >>>>> fallthru/ \EH >>>>> BB2 BBeh >>>>> | /....\ (stuff in EH-region) >>>>> | / some path out of EH region >>>>> | /------------------/ >>>>> BB3 >>>>> >>>>> Here, BB3 must at least be dominated by BB1 (the throwing block), or by >>>>> something further up (when there are other side-entries to the path >>>>> BB2->BB3 or into the EH region). When further up, nothing changes, when >>>>> it's BB1, then it's afterwards dominated by the BB containing the >>>>> condition. So everything with idom==BB1 gets idom=Bcond, except for BBeh, >>>>> which gets idom=Bcall. Depending on how you split BB1, either Bcond or >>>>> BBcall might still be BB1 and doesn't lead to changes in the dom tree. >>>>> >>>>>> > Currently we have quite some of such passes (reassoc, forwprop, >>>>>> > lower_vector_ssa, cse_reciprocals, cse_sincos (sigh!), optimize_bswap >>>>>> > and others), but they are all handling only special situations in one >>>>>> > way or the other. pass_fold_builtins is another one, but it seems >>>>>> > most related to what you want (replacing a call with something else), >>>>>> > so I thought that'd be the natural choice. >>>>>> >>>>>> Well, to be pedantic, it's not really replacing the call. Except for >>>>>> the special case of targets that support direct assignments to errno, >>>>>> it keeps the original call but ensures that it isn't usually executed. >>>>>> From that point of view it doesn't really seem like a fold. >>>>>> >>>>>> But I suppose that's just naming again :-). And it's easily solved with >>>>>> s/fold/rewrite/. >>>>> >>>>> Exactly, in my mind pass_fold_builtin (like many of the others I >>>>> mentioned) doesn't do folding but rewriting :) >>>> >>>> So I am replying here to the issue of where to do the transform call_cdce >>>> does and the one Richard wants to add. For example we "lower" >>>> posix_memalign as early as GIMPLE lowering (that's before CFG construction). >>>> We also lower sincos to cexpi during GENERIC folding (or if that is dropped >>>> either GIMPLE lowering or GIMPLE folding during gimplification would be >>>> appropriate). >>>> >>>> Now, with offloading we have to avoid creating target dependencies before >>>> LTO stream-out (thus no IFN replacements before that - not sure if >>>> Richards patches have an issue there already). >>> >>> No, this patch was the earliest point at which we converted to internal >>> functions. The idea was to make code treat ECF_PURE built-in functions >>> and internal functions as being basically equivalent. There's therefore >>> not much benefit to doing a straight replacement of one with the other >>> during either GENERIC or gimple. Instead the series only used internal >>> functions for things that built-in functions couldn't do, specifically: >>> >>> - the case used in this patch, to optimise part of a non-pure built-in >>> function using a pure equivalent. >>> >>> - vector versions of built-in functions. >>> >>> The cfgexpand patch makes sure that pure built-in functions are expanded >>> like internal functions where possible. >>> >>>> Which would leave us with a lowering stage early in the main >>>> optimization pipeline - I think fold_builtins pass is way too late but >>>> any "folding" pass will do (like forwprop or backprop where the latter >>>> might be better because it might end up computing FP "ranges" to >>>> improve the initial lowering code). >>> >>> This isn't at all related to what backprop is doing though. >>> backprop is about optimising definitions based on information >>> about all uses. >>> >>> Does fold_builtins need to be as late as it is? >> >> Not sure. It folds remaining __builtin_constant_p stuff for example. > > Ah, OK. Can imagine that's quite position-sensitive. > >>>> Of course call_cdce is as good as long as it still exists. >>> >>> Does this meann that you're not against the patch in principle >>> (i.e. keeping call_cdce for now and extending it in the way that >>> this patch does)? >> >> Yes, I'm fine with extending call_cdce. Of course I'd happily >> approve a patch dissolving it into somewhere where it makes more >> sense. But this shouldn't block this patch. > > OK, thanks. Here's an updated patch with the problems that Joseph > mentioned fixed, and rebased on top of the fix for PR 68264. > > Tested on x86_64-linux-gnu, aarch64-linux-gnu, arm-linux-gnueabi > and visium-elf (for the EDOM stuff). OK to install? Ok. Thanks, Richard. > Thanks, > Richard > > > gcc/ > * builtins.c (expand_errno_check, expand_builtin_mathfn) > (expand_builtin_mathfn_2): Delete. > (expand_builtin): Remove handling of functions with > internal function equivalents. > * internal-fn.def (SET_EDOM): New internal function. > * internal-fn.h (set_edom_supported_p): Declare. > * internal-fn.c (expand_SET_EDOM): New function. > (set_edom_supported_p): Likewise. > * tree-call-cdce.c: Include builtins.h and internal-fn.h. > Rewrite comment at head of file. > (is_call_dce_candidate): Rename to... > (can_test_argument_range): ...this. Don't check gimple_call_lhs > or gimple_call_builtin_p here. > (edom_only_function): New function. > (shrink_wrap_one_built_in_call_with_conds): New function, split out > from... > (shrink_wrap_one_built_in_call): ...here. > (can_use_internal_fn, use_internal_fn): New functions. > (shrink_wrap_conditional_dead_built_in_calls): Call use_internal_fn > for calls that have an lhs. > (pass_call_cdce::gate): Remove optimize_function_for_speed_p check. > (pass_call_cdce::execute): Skip blocks that are optimized for size. > Check gimple_call_builtin_p here. Use can_use_internal_fn for > calls with an lhs. > * opts.c (default_options_table): Enable -ftree-builtin-call-cdce > at -O and above. > > diff --git a/gcc/builtins.c b/gcc/builtins.c > index c422d0d..df5c493 100644 > --- a/gcc/builtins.c > +++ b/gcc/builtins.c > @@ -101,9 +101,6 @@ static rtx expand_builtin_apply (rtx, rtx, rtx); > static void expand_builtin_return (rtx); > static enum type_class type_to_class (tree); > static rtx expand_builtin_classify_type (tree); > -static void expand_errno_check (tree, rtx); > -static rtx expand_builtin_mathfn (tree, rtx, rtx); > -static rtx expand_builtin_mathfn_2 (tree, rtx, rtx); > static rtx expand_builtin_mathfn_3 (tree, rtx, rtx); > static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx); > static rtx expand_builtin_interclass_mathfn (tree, rtx); > @@ -1972,286 +1969,6 @@ replacement_internal_fn (gcall *call) > return IFN_LAST; > } > > -/* If errno must be maintained, expand the RTL to check if the result, > - TARGET, of a built-in function call, EXP, is NaN, and if so set > - errno to EDOM. */ > - > -static void > -expand_errno_check (tree exp, rtx target) > -{ > - rtx_code_label *lab = gen_label_rtx (); > - > - /* Test the result; if it is NaN, set errno=EDOM because > - the argument was not in the domain. */ > - do_compare_rtx_and_jump (target, target, EQ, 0, GET_MODE (target), > - NULL_RTX, NULL, lab, > - /* The jump is very likely. */ > - REG_BR_PROB_BASE - (REG_BR_PROB_BASE / 2000 - 1)); > - > -#ifdef TARGET_EDOM > - /* If this built-in doesn't throw an exception, set errno directly. */ > - if (TREE_NOTHROW (TREE_OPERAND (CALL_EXPR_FN (exp), 0))) > - { > -#ifdef GEN_ERRNO_RTX > - rtx errno_rtx = GEN_ERRNO_RTX; > -#else > - rtx errno_rtx > - = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno")); > -#endif > - emit_move_insn (errno_rtx, > - gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx))); > - emit_label (lab); > - return; > - } > -#endif > - > - /* Make sure the library call isn't expanded as a tail call. */ > - CALL_EXPR_TAILCALL (exp) = 0; > - > - /* We can't set errno=EDOM directly; let the library call do it. > - Pop the arguments right away in case the call gets deleted. */ > - NO_DEFER_POP; > - expand_call (exp, target, 0); > - OK_DEFER_POP; > - emit_label (lab); > -} > - > -/* Expand a call to one of the builtin math functions (sqrt, exp, or log). > - Return NULL_RTX if a normal call should be emitted rather than expanding > - the function in-line. EXP is the expression that is a call to the builtin > - function; if convenient, the result should be placed in TARGET. > - SUBTARGET may be used as the target for computing one of EXP's operands. */ > - > -static rtx > -expand_builtin_mathfn (tree exp, rtx target, rtx subtarget) > -{ > - optab builtin_optab; > - rtx op0; > - rtx_insn *insns; > - tree fndecl = get_callee_fndecl (exp); > - machine_mode mode; > - bool errno_set = false; > - bool try_widening = false; > - tree arg; > - > - if (!validate_arglist (exp, REAL_TYPE, VOID_TYPE)) > - return NULL_RTX; > - > - arg = CALL_EXPR_ARG (exp, 0); > - > - switch (DECL_FUNCTION_CODE (fndecl)) > - { > - CASE_FLT_FN (BUILT_IN_SQRT): > - errno_set = ! tree_expr_nonnegative_p (arg); > - try_widening = true; > - builtin_optab = sqrt_optab; > - break; > - CASE_FLT_FN (BUILT_IN_EXP): > - errno_set = true; builtin_optab = exp_optab; break; > - CASE_FLT_FN (BUILT_IN_EXP10): > - CASE_FLT_FN (BUILT_IN_POW10): > - errno_set = true; builtin_optab = exp10_optab; break; > - CASE_FLT_FN (BUILT_IN_EXP2): > - errno_set = true; builtin_optab = exp2_optab; break; > - CASE_FLT_FN (BUILT_IN_EXPM1): > - errno_set = true; builtin_optab = expm1_optab; break; > - CASE_FLT_FN (BUILT_IN_LOGB): > - errno_set = true; builtin_optab = logb_optab; break; > - CASE_FLT_FN (BUILT_IN_LOG): > - errno_set = true; builtin_optab = log_optab; break; > - CASE_FLT_FN (BUILT_IN_LOG10): > - errno_set = true; builtin_optab = log10_optab; break; > - CASE_FLT_FN (BUILT_IN_LOG2): > - errno_set = true; builtin_optab = log2_optab; break; > - CASE_FLT_FN (BUILT_IN_LOG1P): > - errno_set = true; builtin_optab = log1p_optab; break; > - CASE_FLT_FN (BUILT_IN_ASIN): > - builtin_optab = asin_optab; break; > - CASE_FLT_FN (BUILT_IN_ACOS): > - builtin_optab = acos_optab; break; > - CASE_FLT_FN (BUILT_IN_TAN): > - builtin_optab = tan_optab; break; > - CASE_FLT_FN (BUILT_IN_ATAN): > - builtin_optab = atan_optab; break; > - CASE_FLT_FN (BUILT_IN_FLOOR): > - builtin_optab = floor_optab; break; > - CASE_FLT_FN (BUILT_IN_CEIL): > - builtin_optab = ceil_optab; break; > - CASE_FLT_FN (BUILT_IN_TRUNC): > - builtin_optab = btrunc_optab; break; > - CASE_FLT_FN (BUILT_IN_ROUND): > - builtin_optab = round_optab; break; > - CASE_FLT_FN (BUILT_IN_NEARBYINT): > - builtin_optab = nearbyint_optab; > - if (flag_trapping_math) > - break; > - /* Else fallthrough and expand as rint. */ > - CASE_FLT_FN (BUILT_IN_RINT): > - builtin_optab = rint_optab; break; > - CASE_FLT_FN (BUILT_IN_SIGNIFICAND): > - builtin_optab = significand_optab; break; > - default: > - gcc_unreachable (); > - } > - > - /* Make a suitable register to place result in. */ > - mode = TYPE_MODE (TREE_TYPE (exp)); > - > - if (! flag_errno_math || ! HONOR_NANS (mode)) > - errno_set = false; > - > - /* Before working hard, check whether the instruction is available, but try > - to widen the mode for specific operations. */ > - if ((optab_handler (builtin_optab, mode) != CODE_FOR_nothing > - || (try_widening && !excess_precision_type (TREE_TYPE (exp)))) > - && (!errno_set || !optimize_insn_for_size_p ())) > - { > - rtx result = gen_reg_rtx (mode); > - > - /* Wrap the computation of the argument in a SAVE_EXPR, as we may > - need to expand the argument again. This way, we will not perform > - side-effects more the once. */ > - CALL_EXPR_ARG (exp, 0) = arg = builtin_save_expr (arg); > - > - op0 = expand_expr (arg, subtarget, VOIDmode, EXPAND_NORMAL); > - > - start_sequence (); > - > - /* Compute into RESULT. > - Set RESULT to wherever the result comes back. */ > - result = expand_unop (mode, builtin_optab, op0, result, 0); > - > - if (result != 0) > - { > - if (errno_set) > - expand_errno_check (exp, result); > - > - /* Output the entire sequence. */ > - insns = get_insns (); > - end_sequence (); > - emit_insn (insns); > - return result; > - } > - > - /* If we were unable to expand via the builtin, stop the sequence > - (without outputting the insns) and call to the library function > - with the stabilized argument list. */ > - end_sequence (); > - } > - > - return expand_call (exp, target, target == const0_rtx); > -} > - > -/* Expand a call to the builtin binary math functions (pow and atan2). > - Return NULL_RTX if a normal call should be emitted rather than expanding the > - function in-line. EXP is the expression that is a call to the builtin > - function; if convenient, the result should be placed in TARGET. > - SUBTARGET may be used as the target for computing one of EXP's > - operands. */ > - > -static rtx > -expand_builtin_mathfn_2 (tree exp, rtx target, rtx subtarget) > -{ > - optab builtin_optab; > - rtx op0, op1, result; > - rtx_insn *insns; > - int op1_type = REAL_TYPE; > - tree fndecl = get_callee_fndecl (exp); > - tree arg0, arg1; > - machine_mode mode; > - bool errno_set = true; > - > - switch (DECL_FUNCTION_CODE (fndecl)) > - { > - CASE_FLT_FN (BUILT_IN_SCALBN): > - CASE_FLT_FN (BUILT_IN_SCALBLN): > - CASE_FLT_FN (BUILT_IN_LDEXP): > - op1_type = INTEGER_TYPE; > - default: > - break; > - } > - > - if (!validate_arglist (exp, REAL_TYPE, op1_type, VOID_TYPE)) > - return NULL_RTX; > - > - arg0 = CALL_EXPR_ARG (exp, 0); > - arg1 = CALL_EXPR_ARG (exp, 1); > - > - switch (DECL_FUNCTION_CODE (fndecl)) > - { > - CASE_FLT_FN (BUILT_IN_POW): > - builtin_optab = pow_optab; break; > - CASE_FLT_FN (BUILT_IN_ATAN2): > - builtin_optab = atan2_optab; break; > - CASE_FLT_FN (BUILT_IN_SCALB): > - if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2) > - return 0; > - builtin_optab = scalb_optab; break; > - CASE_FLT_FN (BUILT_IN_SCALBN): > - CASE_FLT_FN (BUILT_IN_SCALBLN): > - if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2) > - return 0; > - /* Fall through... */ > - CASE_FLT_FN (BUILT_IN_LDEXP): > - builtin_optab = ldexp_optab; break; > - CASE_FLT_FN (BUILT_IN_FMOD): > - builtin_optab = fmod_optab; break; > - CASE_FLT_FN (BUILT_IN_REMAINDER): > - CASE_FLT_FN (BUILT_IN_DREM): > - builtin_optab = remainder_optab; break; > - default: > - gcc_unreachable (); > - } > - > - /* Make a suitable register to place result in. */ > - mode = TYPE_MODE (TREE_TYPE (exp)); > - > - /* Before working hard, check whether the instruction is available. */ > - if (optab_handler (builtin_optab, mode) == CODE_FOR_nothing) > - return NULL_RTX; > - > - result = gen_reg_rtx (mode); > - > - if (! flag_errno_math || ! HONOR_NANS (mode)) > - errno_set = false; > - > - if (errno_set && optimize_insn_for_size_p ()) > - return 0; > - > - /* Always stabilize the argument list. */ > - CALL_EXPR_ARG (exp, 0) = arg0 = builtin_save_expr (arg0); > - CALL_EXPR_ARG (exp, 1) = arg1 = builtin_save_expr (arg1); > - > - op0 = expand_expr (arg0, subtarget, VOIDmode, EXPAND_NORMAL); > - op1 = expand_normal (arg1); > - > - start_sequence (); > - > - /* Compute into RESULT. > - Set RESULT to wherever the result comes back. */ > - result = expand_binop (mode, builtin_optab, op0, op1, > - result, 0, OPTAB_DIRECT); > - > - /* If we were unable to expand via the builtin, stop the sequence > - (without outputting the insns) and call to the library function > - with the stabilized argument list. */ > - if (result == 0) > - { > - end_sequence (); > - return expand_call (exp, target, target == const0_rtx); > - } > - > - if (errno_set) > - expand_errno_check (exp, result); > - > - /* Output the entire sequence. */ > - insns = get_insns (); > - end_sequence (); > - emit_insn (insns); > - > - return result; > -} > - > /* Expand a call to the builtin trinary math functions (fma). > Return NULL_RTX if a normal call should be emitted rather than expanding the > function in-line. EXP is the expression that is a call to the builtin > @@ -5984,37 +5701,6 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, > CASE_FLT_FN (BUILT_IN_CABS): > break; > > - CASE_FLT_FN (BUILT_IN_EXP): > - CASE_FLT_FN (BUILT_IN_EXP10): > - CASE_FLT_FN (BUILT_IN_POW10): > - CASE_FLT_FN (BUILT_IN_EXP2): > - CASE_FLT_FN (BUILT_IN_EXPM1): > - CASE_FLT_FN (BUILT_IN_LOGB): > - CASE_FLT_FN (BUILT_IN_LOG): > - CASE_FLT_FN (BUILT_IN_LOG10): > - CASE_FLT_FN (BUILT_IN_LOG2): > - CASE_FLT_FN (BUILT_IN_LOG1P): > - CASE_FLT_FN (BUILT_IN_TAN): > - CASE_FLT_FN (BUILT_IN_ASIN): > - CASE_FLT_FN (BUILT_IN_ACOS): > - CASE_FLT_FN (BUILT_IN_ATAN): > - CASE_FLT_FN (BUILT_IN_SIGNIFICAND): > - /* Treat these like sqrt only if unsafe math optimizations are allowed, > - because of possible accuracy problems. */ > - if (! flag_unsafe_math_optimizations) > - break; > - CASE_FLT_FN (BUILT_IN_SQRT): > - CASE_FLT_FN (BUILT_IN_FLOOR): > - CASE_FLT_FN (BUILT_IN_CEIL): > - CASE_FLT_FN (BUILT_IN_TRUNC): > - CASE_FLT_FN (BUILT_IN_ROUND): > - CASE_FLT_FN (BUILT_IN_NEARBYINT): > - CASE_FLT_FN (BUILT_IN_RINT): > - target = expand_builtin_mathfn (exp, target, subtarget); > - if (target) > - return target; > - break; > - > CASE_FLT_FN (BUILT_IN_FMA): > target = expand_builtin_mathfn_ternary (exp, target, subtarget); > if (target) > @@ -6061,23 +5747,6 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, > return target; > break; > > - CASE_FLT_FN (BUILT_IN_ATAN2): > - CASE_FLT_FN (BUILT_IN_LDEXP): > - CASE_FLT_FN (BUILT_IN_SCALB): > - CASE_FLT_FN (BUILT_IN_SCALBN): > - CASE_FLT_FN (BUILT_IN_SCALBLN): > - if (! flag_unsafe_math_optimizations) > - break; > - > - CASE_FLT_FN (BUILT_IN_FMOD): > - CASE_FLT_FN (BUILT_IN_REMAINDER): > - CASE_FLT_FN (BUILT_IN_DREM): > - CASE_FLT_FN (BUILT_IN_POW): > - target = expand_builtin_mathfn_2 (exp, target, subtarget); > - if (target) > - return target; > - break; > - > CASE_FLT_FN (BUILT_IN_CEXPI): > target = expand_builtin_cexpi (exp, target); > gcc_assert (target); > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > index f23d799..06c5d9e 100644 > --- a/gcc/internal-fn.c > +++ b/gcc/internal-fn.c > @@ -2073,6 +2073,24 @@ expand_GOACC_REDUCTION (internal_fn, gcall *) > gcc_unreachable (); > } > > +/* Set errno to EDOM. */ > + > +static void > +expand_SET_EDOM (internal_fn, gcall *) > +{ > +#ifdef TARGET_EDOM > +#ifdef GEN_ERRNO_RTX > + rtx errno_rtx = GEN_ERRNO_RTX; > +#else > + rtx errno_rtx = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno")); > +#endif > + emit_move_insn (errno_rtx, > + gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx))); > +#else > + gcc_unreachable (); > +#endif > +} > + > /* Expand a call to FN using the operands in STMT. FN has a single > output operand and NARGS input operands. */ > > @@ -2217,6 +2235,18 @@ direct_internal_fn_supported_p (internal_fn fn, tree type) > return direct_internal_fn_supported_p (fn, tree_pair (type, type)); > } > > +/* Return true if IFN_SET_EDOM is supported. */ > + > +bool > +set_edom_supported_p (void) > +{ > +#ifdef TARGET_EDOM > + return true; > +#else > + return false; > +#endif > +} > + > #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ > static void \ > expand_##CODE (internal_fn fn, gcall *stmt) \ > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def > index bf8047a..825dba1 100644 > --- a/gcc/internal-fn.def > +++ b/gcc/internal-fn.def > @@ -181,6 +181,10 @@ DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL) > /* OpenACC reduction abstraction. See internal-fn.h for usage. */ > DEF_INTERNAL_FN (GOACC_REDUCTION, ECF_NOTHROW | ECF_LEAF, NULL) > > +/* Set errno to EDOM, if GCC knows how to do that directly for the > + current target. */ > +DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | ECF_NOTHROW, NULL) > + > #undef DEF_INTERNAL_INT_FN > #undef DEF_INTERNAL_FLT_FN > #undef DEF_INTERNAL_OPTAB_FN > diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h > index 5ee43b8..6cb123f 100644 > --- a/gcc/internal-fn.h > +++ b/gcc/internal-fn.h > @@ -160,6 +160,7 @@ extern tree_pair direct_internal_fn_types (internal_fn, tree, tree *); > extern tree_pair direct_internal_fn_types (internal_fn, gcall *); > extern bool direct_internal_fn_supported_p (internal_fn, tree_pair); > extern bool direct_internal_fn_supported_p (internal_fn, tree); > +extern bool set_edom_supported_p (void); > > extern void expand_internal_call (gcall *); > extern void expand_internal_call (internal_fn, gcall *); > diff --git a/gcc/opts.c b/gcc/opts.c > index be04cf5..4345cc8 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -478,6 +478,7 @@ static const struct default_options default_options_table[] = > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fmove_loop_invariants, NULL, 1 }, > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_pta, NULL, 1 }, > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fssa_phiopt, NULL, 1 }, > + { OPT_LEVELS_1_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 }, > > /* -O2 optimizations. */ > { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 }, > @@ -503,7 +504,6 @@ static const struct default_options default_options_table[] = > REORDER_BLOCKS_ALGORITHM_STC }, > { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 }, > - { OPT_LEVELS_2_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 }, > { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 }, > diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c > index fbcc70b..75ef180 100644 > --- a/gcc/tree-call-cdce.c > +++ b/gcc/tree-call-cdce.c > @@ -33,46 +33,77 @@ along with GCC; see the file COPYING3. If not see > #include "gimple-iterator.h" > #include "tree-cfg.h" > #include "tree-into-ssa.h" > +#include "builtins.h" > +#include "internal-fn.h" > > > -/* Conditional dead call elimination > +/* This pass serves two closely-related purposes: > > - Some builtin functions can set errno on error conditions, but they > - are otherwise pure. If the result of a call to such a function is > - not used, the compiler can still not eliminate the call without > - powerful interprocedural analysis to prove that the errno is not > - checked. However, if the conditions under which the error occurs > - are known, the compiler can conditionally dead code eliminate the > - calls by shrink-wrapping the semi-dead calls into the error condition: > + 1. It conditionally executes calls that set errno if (a) the result of > + the call is unused and (b) a simple range check on the arguments can > + detect most cases where errno does not need to be set. > > - built_in_call (args) > - ==> > - if (error_cond (args)) > - built_in_call (args) > + This is the "conditional dead-code elimination" that gave the pass > + its original name, since the call is dead for most argument values. > + The calls for which it helps are usually part of the C++ abstraction > + penalty exposed after inlining. > + > + 2. It looks for calls to built-in functions that set errno and whose > + result is used. It checks whether there is an associated internal > + function that doesn't set errno and whether the target supports > + that internal function. If so, the pass uses the internal function > + to compute the result of the built-in function but still arranges > + for errno to be set when necessary. There are two ways of setting > + errno: > + > + a. by protecting the original call with the same argument checks as (1) > + > + b. by protecting the original call with a check that the result > + of the internal function is not equal to itself (i.e. is NaN). > + > + (b) requires that NaNs are the only erroneous results. It is not > + appropriate for functions like log, which returns ERANGE for zero > + arguments. (b) is also likely to perform worse than (a) because it > + requires the result to be calculated first. The pass therefore uses > + (a) when it can and uses (b) as a fallback. > + > + For (b) the pass can replace the original call with a call to > + IFN_SET_EDOM, if the target supports direct assignments to errno. > + > + In both cases, arguments that require errno to be set should occur > + rarely in practice. Checks of the errno result should also be rare, > + but the compiler would need powerful interprocedural analysis to > + prove that errno is not checked. It's much easier to add argument > + checks or result checks instead. > + > + An example of (1) is: > > - An actual simple example is : > log (x); // Mostly dead call > ==> > if (__builtin_islessequal (x, 0)) > log (x); > > With this change, call to log (x) is effectively eliminated, as > - in majority of the cases, log won't be called with x out of > + in the majority of the cases, log won't be called with x out of > range. The branch is totally predictable, so the branch cost > is low. > > + An example of (2) is: > + > + y = sqrt (x); > + ==> > + y = IFN_SQRT (x); > + if (__builtin_isless (x, 0)) > + sqrt (x); > + > + In the vast majority of cases we should then never need to call sqrt. > + > Note that library functions are not supposed to clear errno to zero without > error. See IEEE Std 1003.1, section 2.3 Error Numbers, and section 7.5:3 of > ISO/IEC 9899 (C99). > > The condition wrapping the builtin call is conservatively set to avoid too > - aggressive (wrong) shrink wrapping. The optimization is called conditional > - dead call elimination because the call is eliminated under the condition > - that the input arguments would not lead to domain or range error (for > - instance when x <= 0 for a log (x) call), however the chances that the error > - condition is hit is very low (those builtin calls which are conditionally > - dead are usually part of the C++ abstraction penalty exposed after > - inlining). */ > + aggressive (wrong) shrink wrapping. */ > > > /* A structure for representing input domain of > @@ -251,28 +282,15 @@ check_builtin_call (gcall *bcall) > return check_target_format (arg); > } > > -/* A helper function to determine if a builtin function call is a > - candidate for conditional DCE. Returns true if the builtin call > - is a candidate. */ > +/* Return true if built-in function call CALL calls a math function > + and if we know how to test the range of its arguments to detect _most_ > + situations in which errno is not set. The test must err on the side > + of treating non-erroneous values as potentially erroneous. */ > > static bool > -is_call_dce_candidate (gcall *call) > +can_test_argument_range (gcall *call) > { > - tree fn; > - enum built_in_function fnc; > - > - /* Only potentially dead calls are considered. */ > - if (gimple_call_lhs (call)) > - return false; > - > - fn = gimple_call_fndecl (call); > - if (!fn > - || !DECL_BUILT_IN (fn) > - || (DECL_BUILT_IN_CLASS (fn) != BUILT_IN_NORMAL)) > - return false; > - > - fnc = DECL_FUNCTION_CODE (fn); > - switch (fnc) > + switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call))) > { > /* Trig functions. */ > CASE_FLT_FN (BUILT_IN_ACOS): > @@ -306,6 +324,31 @@ is_call_dce_candidate (gcall *call) > return false; > } > > +/* Return true if CALL can produce a domain error (EDOM) but can never > + produce a pole, range overflow or range underflow error (all ERANGE). > + This means that we can tell whether a function would have set errno > + by testing whether the result is a NaN. */ > + > +static bool > +edom_only_function (gcall *call) > +{ > + switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call))) > + { > + CASE_FLT_FN (BUILT_IN_ACOS): > + CASE_FLT_FN (BUILT_IN_ASIN): > + CASE_FLT_FN (BUILT_IN_ATAN): > + CASE_FLT_FN (BUILT_IN_COS): > + CASE_FLT_FN (BUILT_IN_SIGNIFICAND): > + CASE_FLT_FN (BUILT_IN_SIN): > + CASE_FLT_FN (BUILT_IN_SQRT): > + CASE_FLT_FN (BUILT_IN_FMOD): > + CASE_FLT_FN (BUILT_IN_REMAINDER): > + return true; > + > + default: > + return false; > + } > +} > > /* A helper function to generate gimple statements for one bound > comparison, so that the built-in function is called whenever > @@ -703,33 +746,24 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds, > /* Probability of the branch (to the call) is taken. */ > #define ERR_PROB 0.01 > > -/* The function to shrink wrap a partially dead builtin call > - whose return value is not used anywhere, but has to be kept > - live due to potential error condition. Returns true if the > - transformation actually happens. */ > +/* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS > + conditions in CONDS is false. > + > + Return true on success, in which case the cfg will have been updated. */ > > static bool > -shrink_wrap_one_built_in_call (gcall *bi_call) > +shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds, > + unsigned int nconds) > { > gimple_stmt_iterator bi_call_bsi; > basic_block bi_call_bb, join_tgt_bb, guard_bb; > edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru; > edge bi_call_in_edge0, guard_bb_in_edge; > - unsigned tn_cond_stmts, nconds; > + unsigned tn_cond_stmts; > unsigned ci; > gimple *cond_expr = NULL; > gimple *cond_expr_start; > > - auto_vec<gimple *, 12> conds; > - gen_shrink_wrap_conditions (bi_call, conds, &nconds); > - > - /* This can happen if the condition generator decides > - it is not beneficial to do the transformation. Just > - return false and do not do any transformation for > - the call. */ > - if (nconds == 0) > - return false; > - > /* The cfg we want to create looks like this: > > [guard n-1] <- guard_bb (old block) > @@ -868,6 +902,117 @@ shrink_wrap_one_built_in_call (gcall *bi_call) > return true; > } > > +/* Shrink-wrap BI_CALL so that it is only called when it might set errno > + (but is always called if it would set errno). > + > + Return true on success, in which case the cfg will have been updated. */ > + > +static bool > +shrink_wrap_one_built_in_call (gcall *bi_call) > +{ > + unsigned nconds = 0; > + auto_vec<gimple *, 12> conds; > + gen_shrink_wrap_conditions (bi_call, conds, &nconds); > + /* This can happen if the condition generator decides > + it is not beneficial to do the transformation. Just > + return false and do not do any transformation for > + the call. */ > + if (nconds == 0) > + return false; > + return shrink_wrap_one_built_in_call_with_conds (bi_call, conds, nconds); > +} > + > +/* Return true if built-in function call CALL could be implemented using > + a combination of an internal function to compute the result and a > + separate call to set errno. */ > + > +static bool > +can_use_internal_fn (gcall *call) > +{ > + /* Only replace calls that set errno. */ > + if (!gimple_vdef (call)) > + return false; > + > + /* Punt if we can't conditionalize the call. */ > + basic_block bb = gimple_bb (call); > + if (stmt_ends_bb_p (call) && !find_fallthru_edge (bb->succs)) > + return false; > + > + /* See whether there is an internal function for this built-in. */ > + if (replacement_internal_fn (call) == IFN_LAST) > + return false; > + > + /* See whether we can catch all cases where errno would be set, > + while still avoiding the call in most cases. */ > + if (!can_test_argument_range (call) > + && !edom_only_function (call)) > + return false; > + > + return true; > +} > + > +/* Implement built-in function call CALL using an internal function. > + Return true on success, in which case the cfg will have changed. */ > + > +static bool > +use_internal_fn (gcall *call) > +{ > + unsigned nconds = 0; > + auto_vec<gimple *, 12> conds; > + gen_shrink_wrap_conditions (call, conds, &nconds); > + if (nconds == 0 && !edom_only_function (call)) > + return false; > + > + internal_fn ifn = replacement_internal_fn (call); > + gcc_assert (ifn != IFN_LAST); > + > + /* Construct the new call, with the same arguments as the original one. */ > + auto_vec <tree, 16> args; > + unsigned int nargs = gimple_call_num_args (call); > + for (unsigned int i = 0; i < nargs; ++i) > + args.safe_push (gimple_call_arg (call, i)); > + gcall *new_call = gimple_build_call_internal_vec (ifn, args); > + gimple_set_location (new_call, gimple_location (call)); > + > + /* Transfer the LHS to the new call. */ > + tree lhs = gimple_call_lhs (call); > + gimple_call_set_lhs (new_call, lhs); > + gimple_call_set_lhs (call, NULL_TREE); > + SSA_NAME_DEF_STMT (lhs) = new_call; > + > + /* Insert the new call. */ > + gimple_stmt_iterator gsi = gsi_for_stmt (call); > + gsi_insert_before (&gsi, new_call, GSI_SAME_STMT); > + > + if (nconds == 0) > + { > + /* Skip the call if LHS == LHS. If we reach here, EDOM is the only > + valid errno value and it is used iff the result is NaN. */ > + conds.quick_push (gimple_build_cond (EQ_EXPR, lhs, lhs, > + NULL_TREE, NULL_TREE)); > + nconds++; > + > + /* Try replacing the original call with a direct assignment to > + errno, via an internal function. */ > + if (set_edom_supported_p () && !stmt_ends_bb_p (call)) > + { > + gimple_stmt_iterator gsi = gsi_for_stmt (call); > + gcall *new_call = gimple_build_call_internal (IFN_SET_EDOM, 0); > + gimple_set_vuse (new_call, gimple_vuse (call)); > + gimple_set_vdef (new_call, gimple_vdef (call)); > + SSA_NAME_DEF_STMT (gimple_vdef (new_call)) = new_call; > + gimple_set_location (new_call, gimple_location (call)); > + gsi_replace (&gsi, new_call, false); > + call = new_call; > + } > + } > + > + if (!shrink_wrap_one_built_in_call_with_conds (call, conds, nconds)) > + /* It's too late to back out now. */ > + gcc_unreachable (); > + return true; > +} > + > /* The top level function for conditional dead code shrink > wrapping transformation. */ > > @@ -884,7 +1029,10 @@ shrink_wrap_conditional_dead_built_in_calls (vec<gcall *> calls) > for (; i < n ; i++) > { > gcall *bi_call = calls[i]; > - changed |= shrink_wrap_one_built_in_call (bi_call); > + if (gimple_call_lhs (bi_call)) > + changed |= use_internal_fn (bi_call); > + else > + changed |= shrink_wrap_one_built_in_call (bi_call); > } > > return changed; > @@ -913,13 +1061,12 @@ public: > {} > > /* opt_pass methods: */ > - virtual bool gate (function *fun) > + virtual bool gate (function *) > { > /* The limit constants used in the implementation > assume IEEE floating point format. Other formats > can be supported in the future if needed. */ > - return flag_tree_builtin_call_dce != 0 > - && optimize_function_for_speed_p (fun); > + return flag_tree_builtin_call_dce != 0; > } > > virtual unsigned int execute (function *); > @@ -935,11 +1082,20 @@ pass_call_cdce::execute (function *fun) > auto_vec<gcall *> cond_dead_built_in_calls; > FOR_EACH_BB_FN (bb, fun) > { > + /* Skip blocks that are being optimized for size, since our > + transformation always increases code size. */ > + if (optimize_bb_for_size_p (bb)) > + continue; > + > /* Collect dead call candidates. */ > for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i)) > { > gcall *stmt = dyn_cast <gcall *> (gsi_stmt (i)); > - if (stmt && is_call_dce_candidate (stmt)) > + if (stmt > + && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL) > + && (gimple_call_lhs (stmt) > + ? can_use_internal_fn (stmt) > + : can_test_argument_range (stmt))) > { > if (dump_file && (dump_flags & TDF_DETAILS)) > { >
diff --git a/gcc/builtins.c b/gcc/builtins.c index c422d0d..df5c493 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -101,9 +101,6 @@ static rtx expand_builtin_apply (rtx, rtx, rtx); static void expand_builtin_return (rtx); static enum type_class type_to_class (tree); static rtx expand_builtin_classify_type (tree); -static void expand_errno_check (tree, rtx); -static rtx expand_builtin_mathfn (tree, rtx, rtx); -static rtx expand_builtin_mathfn_2 (tree, rtx, rtx); static rtx expand_builtin_mathfn_3 (tree, rtx, rtx); static rtx expand_builtin_mathfn_ternary (tree, rtx, rtx); static rtx expand_builtin_interclass_mathfn (tree, rtx); @@ -1972,286 +1969,6 @@ replacement_internal_fn (gcall *call) return IFN_LAST; } -/* If errno must be maintained, expand the RTL to check if the result, - TARGET, of a built-in function call, EXP, is NaN, and if so set - errno to EDOM. */ - -static void -expand_errno_check (tree exp, rtx target) -{ - rtx_code_label *lab = gen_label_rtx (); - - /* Test the result; if it is NaN, set errno=EDOM because - the argument was not in the domain. */ - do_compare_rtx_and_jump (target, target, EQ, 0, GET_MODE (target), - NULL_RTX, NULL, lab, - /* The jump is very likely. */ - REG_BR_PROB_BASE - (REG_BR_PROB_BASE / 2000 - 1)); - -#ifdef TARGET_EDOM - /* If this built-in doesn't throw an exception, set errno directly. */ - if (TREE_NOTHROW (TREE_OPERAND (CALL_EXPR_FN (exp), 0))) - { -#ifdef GEN_ERRNO_RTX - rtx errno_rtx = GEN_ERRNO_RTX; -#else - rtx errno_rtx - = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno")); -#endif - emit_move_insn (errno_rtx, - gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx))); - emit_label (lab); - return; - } -#endif - - /* Make sure the library call isn't expanded as a tail call. */ - CALL_EXPR_TAILCALL (exp) = 0; - - /* We can't set errno=EDOM directly; let the library call do it. - Pop the arguments right away in case the call gets deleted. */ - NO_DEFER_POP; - expand_call (exp, target, 0); - OK_DEFER_POP; - emit_label (lab); -} - -/* Expand a call to one of the builtin math functions (sqrt, exp, or log). - Return NULL_RTX if a normal call should be emitted rather than expanding - the function in-line. EXP is the expression that is a call to the builtin - function; if convenient, the result should be placed in TARGET. - SUBTARGET may be used as the target for computing one of EXP's operands. */ - -static rtx -expand_builtin_mathfn (tree exp, rtx target, rtx subtarget) -{ - optab builtin_optab; - rtx op0; - rtx_insn *insns; - tree fndecl = get_callee_fndecl (exp); - machine_mode mode; - bool errno_set = false; - bool try_widening = false; - tree arg; - - if (!validate_arglist (exp, REAL_TYPE, VOID_TYPE)) - return NULL_RTX; - - arg = CALL_EXPR_ARG (exp, 0); - - switch (DECL_FUNCTION_CODE (fndecl)) - { - CASE_FLT_FN (BUILT_IN_SQRT): - errno_set = ! tree_expr_nonnegative_p (arg); - try_widening = true; - builtin_optab = sqrt_optab; - break; - CASE_FLT_FN (BUILT_IN_EXP): - errno_set = true; builtin_optab = exp_optab; break; - CASE_FLT_FN (BUILT_IN_EXP10): - CASE_FLT_FN (BUILT_IN_POW10): - errno_set = true; builtin_optab = exp10_optab; break; - CASE_FLT_FN (BUILT_IN_EXP2): - errno_set = true; builtin_optab = exp2_optab; break; - CASE_FLT_FN (BUILT_IN_EXPM1): - errno_set = true; builtin_optab = expm1_optab; break; - CASE_FLT_FN (BUILT_IN_LOGB): - errno_set = true; builtin_optab = logb_optab; break; - CASE_FLT_FN (BUILT_IN_LOG): - errno_set = true; builtin_optab = log_optab; break; - CASE_FLT_FN (BUILT_IN_LOG10): - errno_set = true; builtin_optab = log10_optab; break; - CASE_FLT_FN (BUILT_IN_LOG2): - errno_set = true; builtin_optab = log2_optab; break; - CASE_FLT_FN (BUILT_IN_LOG1P): - errno_set = true; builtin_optab = log1p_optab; break; - CASE_FLT_FN (BUILT_IN_ASIN): - builtin_optab = asin_optab; break; - CASE_FLT_FN (BUILT_IN_ACOS): - builtin_optab = acos_optab; break; - CASE_FLT_FN (BUILT_IN_TAN): - builtin_optab = tan_optab; break; - CASE_FLT_FN (BUILT_IN_ATAN): - builtin_optab = atan_optab; break; - CASE_FLT_FN (BUILT_IN_FLOOR): - builtin_optab = floor_optab; break; - CASE_FLT_FN (BUILT_IN_CEIL): - builtin_optab = ceil_optab; break; - CASE_FLT_FN (BUILT_IN_TRUNC): - builtin_optab = btrunc_optab; break; - CASE_FLT_FN (BUILT_IN_ROUND): - builtin_optab = round_optab; break; - CASE_FLT_FN (BUILT_IN_NEARBYINT): - builtin_optab = nearbyint_optab; - if (flag_trapping_math) - break; - /* Else fallthrough and expand as rint. */ - CASE_FLT_FN (BUILT_IN_RINT): - builtin_optab = rint_optab; break; - CASE_FLT_FN (BUILT_IN_SIGNIFICAND): - builtin_optab = significand_optab; break; - default: - gcc_unreachable (); - } - - /* Make a suitable register to place result in. */ - mode = TYPE_MODE (TREE_TYPE (exp)); - - if (! flag_errno_math || ! HONOR_NANS (mode)) - errno_set = false; - - /* Before working hard, check whether the instruction is available, but try - to widen the mode for specific operations. */ - if ((optab_handler (builtin_optab, mode) != CODE_FOR_nothing - || (try_widening && !excess_precision_type (TREE_TYPE (exp)))) - && (!errno_set || !optimize_insn_for_size_p ())) - { - rtx result = gen_reg_rtx (mode); - - /* Wrap the computation of the argument in a SAVE_EXPR, as we may - need to expand the argument again. This way, we will not perform - side-effects more the once. */ - CALL_EXPR_ARG (exp, 0) = arg = builtin_save_expr (arg); - - op0 = expand_expr (arg, subtarget, VOIDmode, EXPAND_NORMAL); - - start_sequence (); - - /* Compute into RESULT. - Set RESULT to wherever the result comes back. */ - result = expand_unop (mode, builtin_optab, op0, result, 0); - - if (result != 0) - { - if (errno_set) - expand_errno_check (exp, result); - - /* Output the entire sequence. */ - insns = get_insns (); - end_sequence (); - emit_insn (insns); - return result; - } - - /* If we were unable to expand via the builtin, stop the sequence - (without outputting the insns) and call to the library function - with the stabilized argument list. */ - end_sequence (); - } - - return expand_call (exp, target, target == const0_rtx); -} - -/* Expand a call to the builtin binary math functions (pow and atan2). - Return NULL_RTX if a normal call should be emitted rather than expanding the - function in-line. EXP is the expression that is a call to the builtin - function; if convenient, the result should be placed in TARGET. - SUBTARGET may be used as the target for computing one of EXP's - operands. */ - -static rtx -expand_builtin_mathfn_2 (tree exp, rtx target, rtx subtarget) -{ - optab builtin_optab; - rtx op0, op1, result; - rtx_insn *insns; - int op1_type = REAL_TYPE; - tree fndecl = get_callee_fndecl (exp); - tree arg0, arg1; - machine_mode mode; - bool errno_set = true; - - switch (DECL_FUNCTION_CODE (fndecl)) - { - CASE_FLT_FN (BUILT_IN_SCALBN): - CASE_FLT_FN (BUILT_IN_SCALBLN): - CASE_FLT_FN (BUILT_IN_LDEXP): - op1_type = INTEGER_TYPE; - default: - break; - } - - if (!validate_arglist (exp, REAL_TYPE, op1_type, VOID_TYPE)) - return NULL_RTX; - - arg0 = CALL_EXPR_ARG (exp, 0); - arg1 = CALL_EXPR_ARG (exp, 1); - - switch (DECL_FUNCTION_CODE (fndecl)) - { - CASE_FLT_FN (BUILT_IN_POW): - builtin_optab = pow_optab; break; - CASE_FLT_FN (BUILT_IN_ATAN2): - builtin_optab = atan2_optab; break; - CASE_FLT_FN (BUILT_IN_SCALB): - if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2) - return 0; - builtin_optab = scalb_optab; break; - CASE_FLT_FN (BUILT_IN_SCALBN): - CASE_FLT_FN (BUILT_IN_SCALBLN): - if (REAL_MODE_FORMAT (TYPE_MODE (TREE_TYPE (exp)))->b != 2) - return 0; - /* Fall through... */ - CASE_FLT_FN (BUILT_IN_LDEXP): - builtin_optab = ldexp_optab; break; - CASE_FLT_FN (BUILT_IN_FMOD): - builtin_optab = fmod_optab; break; - CASE_FLT_FN (BUILT_IN_REMAINDER): - CASE_FLT_FN (BUILT_IN_DREM): - builtin_optab = remainder_optab; break; - default: - gcc_unreachable (); - } - - /* Make a suitable register to place result in. */ - mode = TYPE_MODE (TREE_TYPE (exp)); - - /* Before working hard, check whether the instruction is available. */ - if (optab_handler (builtin_optab, mode) == CODE_FOR_nothing) - return NULL_RTX; - - result = gen_reg_rtx (mode); - - if (! flag_errno_math || ! HONOR_NANS (mode)) - errno_set = false; - - if (errno_set && optimize_insn_for_size_p ()) - return 0; - - /* Always stabilize the argument list. */ - CALL_EXPR_ARG (exp, 0) = arg0 = builtin_save_expr (arg0); - CALL_EXPR_ARG (exp, 1) = arg1 = builtin_save_expr (arg1); - - op0 = expand_expr (arg0, subtarget, VOIDmode, EXPAND_NORMAL); - op1 = expand_normal (arg1); - - start_sequence (); - - /* Compute into RESULT. - Set RESULT to wherever the result comes back. */ - result = expand_binop (mode, builtin_optab, op0, op1, - result, 0, OPTAB_DIRECT); - - /* If we were unable to expand via the builtin, stop the sequence - (without outputting the insns) and call to the library function - with the stabilized argument list. */ - if (result == 0) - { - end_sequence (); - return expand_call (exp, target, target == const0_rtx); - } - - if (errno_set) - expand_errno_check (exp, result); - - /* Output the entire sequence. */ - insns = get_insns (); - end_sequence (); - emit_insn (insns); - - return result; -} - /* Expand a call to the builtin trinary math functions (fma). Return NULL_RTX if a normal call should be emitted rather than expanding the function in-line. EXP is the expression that is a call to the builtin @@ -5984,37 +5701,6 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, CASE_FLT_FN (BUILT_IN_CABS): break; - CASE_FLT_FN (BUILT_IN_EXP): - CASE_FLT_FN (BUILT_IN_EXP10): - CASE_FLT_FN (BUILT_IN_POW10): - CASE_FLT_FN (BUILT_IN_EXP2): - CASE_FLT_FN (BUILT_IN_EXPM1): - CASE_FLT_FN (BUILT_IN_LOGB): - CASE_FLT_FN (BUILT_IN_LOG): - CASE_FLT_FN (BUILT_IN_LOG10): - CASE_FLT_FN (BUILT_IN_LOG2): - CASE_FLT_FN (BUILT_IN_LOG1P): - CASE_FLT_FN (BUILT_IN_TAN): - CASE_FLT_FN (BUILT_IN_ASIN): - CASE_FLT_FN (BUILT_IN_ACOS): - CASE_FLT_FN (BUILT_IN_ATAN): - CASE_FLT_FN (BUILT_IN_SIGNIFICAND): - /* Treat these like sqrt only if unsafe math optimizations are allowed, - because of possible accuracy problems. */ - if (! flag_unsafe_math_optimizations) - break; - CASE_FLT_FN (BUILT_IN_SQRT): - CASE_FLT_FN (BUILT_IN_FLOOR): - CASE_FLT_FN (BUILT_IN_CEIL): - CASE_FLT_FN (BUILT_IN_TRUNC): - CASE_FLT_FN (BUILT_IN_ROUND): - CASE_FLT_FN (BUILT_IN_NEARBYINT): - CASE_FLT_FN (BUILT_IN_RINT): - target = expand_builtin_mathfn (exp, target, subtarget); - if (target) - return target; - break; - CASE_FLT_FN (BUILT_IN_FMA): target = expand_builtin_mathfn_ternary (exp, target, subtarget); if (target) @@ -6061,23 +5747,6 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, return target; break; - CASE_FLT_FN (BUILT_IN_ATAN2): - CASE_FLT_FN (BUILT_IN_LDEXP): - CASE_FLT_FN (BUILT_IN_SCALB): - CASE_FLT_FN (BUILT_IN_SCALBN): - CASE_FLT_FN (BUILT_IN_SCALBLN): - if (! flag_unsafe_math_optimizations) - break; - - CASE_FLT_FN (BUILT_IN_FMOD): - CASE_FLT_FN (BUILT_IN_REMAINDER): - CASE_FLT_FN (BUILT_IN_DREM): - CASE_FLT_FN (BUILT_IN_POW): - target = expand_builtin_mathfn_2 (exp, target, subtarget); - if (target) - return target; - break; - CASE_FLT_FN (BUILT_IN_CEXPI): target = expand_builtin_cexpi (exp, target); gcc_assert (target); diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index f23d799..06c5d9e 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -2073,6 +2073,24 @@ expand_GOACC_REDUCTION (internal_fn, gcall *) gcc_unreachable (); } +/* Set errno to EDOM. */ + +static void +expand_SET_EDOM (internal_fn, gcall *) +{ +#ifdef TARGET_EDOM +#ifdef GEN_ERRNO_RTX + rtx errno_rtx = GEN_ERRNO_RTX; +#else + rtx errno_rtx = gen_rtx_MEM (word_mode, gen_rtx_SYMBOL_REF (Pmode, "errno")); +#endif + emit_move_insn (errno_rtx, + gen_int_mode (TARGET_EDOM, GET_MODE (errno_rtx))); +#else + gcc_unreachable (); +#endif +} + /* Expand a call to FN using the operands in STMT. FN has a single output operand and NARGS input operands. */ @@ -2217,6 +2235,18 @@ direct_internal_fn_supported_p (internal_fn fn, tree type) return direct_internal_fn_supported_p (fn, tree_pair (type, type)); } +/* Return true if IFN_SET_EDOM is supported. */ + +bool +set_edom_supported_p (void) +{ +#ifdef TARGET_EDOM + return true; +#else + return false; +#endif +} + #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \ static void \ expand_##CODE (internal_fn fn, gcall *stmt) \ diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index bf8047a..825dba1 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -181,6 +181,10 @@ DEF_INTERNAL_FN (GOACC_LOOP, ECF_PURE | ECF_NOTHROW, NULL) /* OpenACC reduction abstraction. See internal-fn.h for usage. */ DEF_INTERNAL_FN (GOACC_REDUCTION, ECF_NOTHROW | ECF_LEAF, NULL) +/* Set errno to EDOM, if GCC knows how to do that directly for the + current target. */ +DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | ECF_NOTHROW, NULL) + #undef DEF_INTERNAL_INT_FN #undef DEF_INTERNAL_FLT_FN #undef DEF_INTERNAL_OPTAB_FN diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h index 5ee43b8..6cb123f 100644 --- a/gcc/internal-fn.h +++ b/gcc/internal-fn.h @@ -160,6 +160,7 @@ extern tree_pair direct_internal_fn_types (internal_fn, tree, tree *); extern tree_pair direct_internal_fn_types (internal_fn, gcall *); extern bool direct_internal_fn_supported_p (internal_fn, tree_pair); extern bool direct_internal_fn_supported_p (internal_fn, tree); +extern bool set_edom_supported_p (void); extern void expand_internal_call (gcall *); extern void expand_internal_call (internal_fn, gcall *); diff --git a/gcc/opts.c b/gcc/opts.c index be04cf5..4345cc8 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -478,6 +478,7 @@ static const struct default_options default_options_table[] = { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fmove_loop_invariants, NULL, 1 }, { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_pta, NULL, 1 }, { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fssa_phiopt, NULL, 1 }, + { OPT_LEVELS_1_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 }, /* -O2 optimizations. */ { OPT_LEVELS_2_PLUS, OPT_finline_small_functions, NULL, 1 }, @@ -503,7 +504,6 @@ static const struct default_options default_options_table[] = REORDER_BLOCKS_ALGORITHM_STC }, { OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 }, - { OPT_LEVELS_2_PLUS, OPT_ftree_builtin_call_dce, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 }, diff --git a/gcc/tree-call-cdce.c b/gcc/tree-call-cdce.c index fbcc70b..75ef180 100644 --- a/gcc/tree-call-cdce.c +++ b/gcc/tree-call-cdce.c @@ -33,46 +33,77 @@ along with GCC; see the file COPYING3. If not see #include "gimple-iterator.h" #include "tree-cfg.h" #include "tree-into-ssa.h" +#include "builtins.h" +#include "internal-fn.h" -/* Conditional dead call elimination +/* This pass serves two closely-related purposes: - Some builtin functions can set errno on error conditions, but they - are otherwise pure. If the result of a call to such a function is - not used, the compiler can still not eliminate the call without - powerful interprocedural analysis to prove that the errno is not - checked. However, if the conditions under which the error occurs - are known, the compiler can conditionally dead code eliminate the - calls by shrink-wrapping the semi-dead calls into the error condition: + 1. It conditionally executes calls that set errno if (a) the result of + the call is unused and (b) a simple range check on the arguments can + detect most cases where errno does not need to be set. - built_in_call (args) - ==> - if (error_cond (args)) - built_in_call (args) + This is the "conditional dead-code elimination" that gave the pass + its original name, since the call is dead for most argument values. + The calls for which it helps are usually part of the C++ abstraction + penalty exposed after inlining. + + 2. It looks for calls to built-in functions that set errno and whose + result is used. It checks whether there is an associated internal + function that doesn't set errno and whether the target supports + that internal function. If so, the pass uses the internal function + to compute the result of the built-in function but still arranges + for errno to be set when necessary. There are two ways of setting + errno: + + a. by protecting the original call with the same argument checks as (1) + + b. by protecting the original call with a check that the result + of the internal function is not equal to itself (i.e. is NaN). + + (b) requires that NaNs are the only erroneous results. It is not + appropriate for functions like log, which returns ERANGE for zero + arguments. (b) is also likely to perform worse than (a) because it + requires the result to be calculated first. The pass therefore uses + (a) when it can and uses (b) as a fallback. + + For (b) the pass can replace the original call with a call to + IFN_SET_EDOM, if the target supports direct assignments to errno. + + In both cases, arguments that require errno to be set should occur + rarely in practice. Checks of the errno result should also be rare, + but the compiler would need powerful interprocedural analysis to + prove that errno is not checked. It's much easier to add argument + checks or result checks instead. + + An example of (1) is: - An actual simple example is : log (x); // Mostly dead call ==> if (__builtin_islessequal (x, 0)) log (x); With this change, call to log (x) is effectively eliminated, as - in majority of the cases, log won't be called with x out of + in the majority of the cases, log won't be called with x out of range. The branch is totally predictable, so the branch cost is low. + An example of (2) is: + + y = sqrt (x); + ==> + y = IFN_SQRT (x); + if (__builtin_isless (x, 0)) + sqrt (x); + + In the vast majority of cases we should then never need to call sqrt. + Note that library functions are not supposed to clear errno to zero without error. See IEEE Std 1003.1, section 2.3 Error Numbers, and section 7.5:3 of ISO/IEC 9899 (C99). The condition wrapping the builtin call is conservatively set to avoid too - aggressive (wrong) shrink wrapping. The optimization is called conditional - dead call elimination because the call is eliminated under the condition - that the input arguments would not lead to domain or range error (for - instance when x <= 0 for a log (x) call), however the chances that the error - condition is hit is very low (those builtin calls which are conditionally - dead are usually part of the C++ abstraction penalty exposed after - inlining). */ + aggressive (wrong) shrink wrapping. */ /* A structure for representing input domain of @@ -251,28 +282,15 @@ check_builtin_call (gcall *bcall) return check_target_format (arg); } -/* A helper function to determine if a builtin function call is a - candidate for conditional DCE. Returns true if the builtin call - is a candidate. */ +/* Return true if built-in function call CALL calls a math function + and if we know how to test the range of its arguments to detect _most_ + situations in which errno is not set. The test must err on the side + of treating non-erroneous values as potentially erroneous. */ static bool -is_call_dce_candidate (gcall *call) +can_test_argument_range (gcall *call) { - tree fn; - enum built_in_function fnc; - - /* Only potentially dead calls are considered. */ - if (gimple_call_lhs (call)) - return false; - - fn = gimple_call_fndecl (call); - if (!fn - || !DECL_BUILT_IN (fn) - || (DECL_BUILT_IN_CLASS (fn) != BUILT_IN_NORMAL)) - return false; - - fnc = DECL_FUNCTION_CODE (fn); - switch (fnc) + switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call))) { /* Trig functions. */ CASE_FLT_FN (BUILT_IN_ACOS): @@ -306,6 +324,31 @@ is_call_dce_candidate (gcall *call) return false; } +/* Return true if CALL can produce a domain error (EDOM) but can never + produce a pole, range overflow or range underflow error (all ERANGE). + This means that we can tell whether a function would have set errno + by testing whether the result is a NaN. */ + +static bool +edom_only_function (gcall *call) +{ + switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call))) + { + CASE_FLT_FN (BUILT_IN_ACOS): + CASE_FLT_FN (BUILT_IN_ASIN): + CASE_FLT_FN (BUILT_IN_ATAN): + CASE_FLT_FN (BUILT_IN_COS): + CASE_FLT_FN (BUILT_IN_SIGNIFICAND): + CASE_FLT_FN (BUILT_IN_SIN): + CASE_FLT_FN (BUILT_IN_SQRT): + CASE_FLT_FN (BUILT_IN_FMOD): + CASE_FLT_FN (BUILT_IN_REMAINDER): + return true; + + default: + return false; + } +} /* A helper function to generate gimple statements for one bound comparison, so that the built-in function is called whenever @@ -703,33 +746,24 @@ gen_shrink_wrap_conditions (gcall *bi_call, vec<gimple *> conds, /* Probability of the branch (to the call) is taken. */ #define ERR_PROB 0.01 -/* The function to shrink wrap a partially dead builtin call - whose return value is not used anywhere, but has to be kept - live due to potential error condition. Returns true if the - transformation actually happens. */ +/* Shrink-wrap BI_CALL so that it is only called when one of the NCONDS + conditions in CONDS is false. + + Return true on success, in which case the cfg will have been updated. */ static bool -shrink_wrap_one_built_in_call (gcall *bi_call) +shrink_wrap_one_built_in_call_with_conds (gcall *bi_call, vec <gimple *> conds, + unsigned int nconds) { gimple_stmt_iterator bi_call_bsi; basic_block bi_call_bb, join_tgt_bb, guard_bb; edge join_tgt_in_edge_from_call, join_tgt_in_edge_fall_thru; edge bi_call_in_edge0, guard_bb_in_edge; - unsigned tn_cond_stmts, nconds; + unsigned tn_cond_stmts; unsigned ci; gimple *cond_expr = NULL; gimple *cond_expr_start; - auto_vec<gimple *, 12> conds; - gen_shrink_wrap_conditions (bi_call, conds, &nconds); - - /* This can happen if the condition generator decides - it is not beneficial to do the transformation. Just - return false and do not do any transformation for - the call. */ - if (nconds == 0) - return false; - /* The cfg we want to create looks like this: [guard n-1] <- guard_bb (old block) @@ -868,6 +902,117 @@ shrink_wrap_one_built_in_call (gcall *bi_call) return true; } +/* Shrink-wrap BI_CALL so that it is only called when it might set errno + (but is always called if it would set errno). + + Return true on success, in which case the cfg will have been updated. */ + +static bool +shrink_wrap_one_built_in_call (gcall *bi_call) +{ + unsigned nconds = 0; + auto_vec<gimple *, 12> conds; + gen_shrink_wrap_conditions (bi_call, conds, &nconds); + /* This can happen if the condition generator decides + it is not beneficial to do the transformation. Just + return false and do not do any transformation for + the call. */ + if (nconds == 0) + return false; + return shrink_wrap_one_built_in_call_with_conds (bi_call, conds, nconds); +} + +/* Return true if built-in function call CALL could be implemented using + a combination of an internal function to compute the result and a + separate call to set errno. */ + +static bool +can_use_internal_fn (gcall *call) +{ + /* Only replace calls that set errno. */ + if (!gimple_vdef (call)) + return false; + + /* Punt if we can't conditionalize the call. */ + basic_block bb = gimple_bb (call); + if (stmt_ends_bb_p (call) && !find_fallthru_edge (bb->succs)) + return false; + + /* See whether there is an internal function for this built-in. */ + if (replacement_internal_fn (call) == IFN_LAST) + return false; + + /* See whether we can catch all cases where errno would be set, + while still avoiding the call in most cases. */ + if (!can_test_argument_range (call) + && !edom_only_function (call)) + return false; + + return true; +} + +/* Implement built-in function call CALL using an internal function. + Return true on success, in which case the cfg will have changed. */ + +static bool +use_internal_fn (gcall *call) +{ + unsigned nconds = 0; + auto_vec<gimple *, 12> conds; + gen_shrink_wrap_conditions (call, conds, &nconds); + if (nconds == 0 && !edom_only_function (call)) + return false; + + internal_fn ifn = replacement_internal_fn (call); + gcc_assert (ifn != IFN_LAST); + + /* Construct the new call, with the same arguments as the original one. */ + auto_vec <tree, 16> args; + unsigned int nargs = gimple_call_num_args (call); + for (unsigned int i = 0; i < nargs; ++i) + args.safe_push (gimple_call_arg (call, i)); + gcall *new_call = gimple_build_call_internal_vec (ifn, args); + gimple_set_location (new_call, gimple_location (call)); + + /* Transfer the LHS to the new call. */ + tree lhs = gimple_call_lhs (call); + gimple_call_set_lhs (new_call, lhs); + gimple_call_set_lhs (call, NULL_TREE); + SSA_NAME_DEF_STMT (lhs) = new_call; + + /* Insert the new call. */ + gimple_stmt_iterator gsi = gsi_for_stmt (call); + gsi_insert_before (&gsi, new_call, GSI_SAME_STMT); + + if (nconds == 0) + { + /* Skip the call if LHS == LHS. If we reach here, EDOM is the only + valid errno value and it is used iff the result is NaN. */ + conds.quick_push (gimple_build_cond (EQ_EXPR, lhs, lhs, + NULL_TREE, NULL_TREE)); + nconds++; + + /* Try replacing the original call with a direct assignment to + errno, via an internal function. */ + if (set_edom_supported_p () && !stmt_ends_bb_p (call)) + { + gimple_stmt_iterator gsi = gsi_for_stmt (call); + gcall *new_call = gimple_build_call_internal (IFN_SET_EDOM, 0); + gimple_set_vuse (new_call, gimple_vuse (call)); + gimple_set_vdef (new_call, gimple_vdef (call)); + SSA_NAME_DEF_STMT (gimple_vdef (new_call)) = new_call; + gimple_set_location (new_call, gimple_location (call)); + gsi_replace (&gsi, new_call, false); + call = new_call; + } + } + + if (!shrink_wrap_one_built_in_call_with_conds (call, conds, nconds)) + /* It's too late to back out now. */ + gcc_unreachable (); + return true; +} + /* The top level function for conditional dead code shrink wrapping transformation. */ @@ -884,7 +1029,10 @@ shrink_wrap_conditional_dead_built_in_calls (vec<gcall *> calls) for (; i < n ; i++) { gcall *bi_call = calls[i]; - changed |= shrink_wrap_one_built_in_call (bi_call); + if (gimple_call_lhs (bi_call)) + changed |= use_internal_fn (bi_call); + else + changed |= shrink_wrap_one_built_in_call (bi_call); } return changed; @@ -913,13 +1061,12 @@ public: {} /* opt_pass methods: */ - virtual bool gate (function *fun) + virtual bool gate (function *) { /* The limit constants used in the implementation assume IEEE floating point format. Other formats can be supported in the future if needed. */ - return flag_tree_builtin_call_dce != 0 - && optimize_function_for_speed_p (fun); + return flag_tree_builtin_call_dce != 0; } virtual unsigned int execute (function *); @@ -935,11 +1082,20 @@ pass_call_cdce::execute (function *fun) auto_vec<gcall *> cond_dead_built_in_calls; FOR_EACH_BB_FN (bb, fun) { + /* Skip blocks that are being optimized for size, since our + transformation always increases code size. */ + if (optimize_bb_for_size_p (bb)) + continue; + /* Collect dead call candidates. */ for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i)) { gcall *stmt = dyn_cast <gcall *> (gsi_stmt (i)); - if (stmt && is_call_dce_candidate (stmt)) + if (stmt + && gimple_call_builtin_p (stmt, BUILT_IN_NORMAL) + && (gimple_call_lhs (stmt) + ? can_use_internal_fn (stmt) + : can_test_argument_range (stmt))) { if (dump_file && (dump_flags & TDF_DETAILS)) {