Message ID | tencent_3B2442361B6DF87E046FB4E71D55715D6309@qq.com |
---|---|
State | New |
Series | RISC-V: Add function multiversioning support |
Could you add testcases? Also, could you split this into smaller patches to make it easier to review?

On Sun, Oct 20, 2024 at 1:24 PM Yangyu Chen <cyy@cyyself.name> wrote:
>
> This patch adds support for function multi-versioning to the RISC-V
> using the target_clones and target_versions attributes, which follow
> the RISC-V C-API Docs [1] and the existing proposal about priority
> syntax [2].
>
> This patch copies many codes from commit 0cfde688e213 ("[aarch64]
> Add function multiversioning support") and modifies them to fit the
> RISC-V port. Some key differences are introduced in previously
> submitted patches [3] and [4], commit [5] and [6].
>
> To test this patch with the GLIBC dynamic loader, you should apply
> patch [7] for GLIBC to ensure the dynamic loader will initialize
> the gp register correctly.
>
> [1] https://github.com/riscv-non-isa/riscv-c-api-doc/blob/c6c5d6d9cf96b342293315a5dff3d25e96ef8191/src/c-api.adoc#__attribute__targetattr-string
> [2] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85
> [3] https://patchwork.sourceware.org/project/gcc/patch/20241015181607.3689413-1-chenyangyu@isrc.iscas.ac.cn/
> [4] https://patchwork.sourceware.org/project/gcc/patch/20241005181920.2223674-1-chenyangyu@isrc.iscas.ac.cn/
> [5] https://github.com/cyyself/gcc/commit/618352340e1e78057d1e9864c87d8641a2522f80
> [6] https://github.com/cyyself/gcc/commit/30e011886299996b2e1f922b13ec4192bbde6e94
> [7] https://patchwork.sourceware.org/project/glibc/patch/tencent_71D182FBDA6E8E57B80731DD218D8D5C7C08@qq.com/
>
> Co-Developed-by: Hank Chang <hank.chang@sifive.com>
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc
> (struct riscv_ext_bitmask_table_t): New struct.
> (riscv_minimal_hwprobe_feature_bits): New function.
> * config/riscv/riscv-protos.h
> (riscv_option_valid_version_attribute_p): New function.
> (riscv_process_target_attr): New function.
> * config/riscv/riscv-subset.h
> (riscv_minimal_hwprobe_feature_bits): New function.
> * config/riscv/riscv-target-attr.cc
> (riscv_target_attr_parser::handle_priority): New function.
> (riscv_target_attr_parser::update_settings): Update priority
> attribute and never free the arch string.
> (riscv_process_one_target_attr): Add const qualifier to arg_str
> and split arg_str with ';'.
> (riscv_process_target_attr): New implementation which consumes
> the const char *args instead of tree.
> (riscv_option_valid_attribute_p): Reapply any target_version
> attribute after target attribute.
> (riscv_process_target_version_attr): New function.
> * config/riscv/riscv.cc (riscv_can_inline_p): Refuse to inline
> when callee is versioned but caller is not.
> (parse_features_for_version): New function.
> (compare_fmv_features): New function.
> (riscv_compare_version_priority): New function.
> (riscv_common_function_versions): New function.
> (add_condition_to_bb): New function.
> (dispatch_function_versions): New function.
> (get_suffixed_assembler_name): New function.
> (make_resolver_func): New function.
> (riscv_mangle_decl_assembler_name): New function.
> (riscv_generate_version_dispatcher_body): New function.
> (riscv_get_function_versions_dispatcher): New function.
> (TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P): Implement it.
> (TARGET_OPTION_FUNCTION_VERSIONS): Implement it.
> (TARGET_COMPARE_VERSION_PRIORITY): Implement it.
> (TARGET_GENERATE_VERSION_DISPATCHER_BODY): Implement it.
> (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): Implement it.
> (TARGET_MANGLE_DECL_ASSEMBLER_NAME): Implement it.
> * config/riscv/riscv.opt: Add TargetVariable riscv_fmv_priority.
> * defaults.h (TARGET_CLONES_ATTR_SEPARATOR): Define new macro.
> * multiple_target.cc (get_attr_str): Use
> TARGET_CLONES_ATTR_SEPARATOR to separate attributes.
> (separate_attrs): Likewise.
> * config/riscv/riscv.h
> (TARGET_CLONES_ATTR_SEPARATOR): Implement it.
> (TARGET_HAS_FMV_TARGET_ATTRIBUTE): Implement it.
> * config/riscv/feature_bits.h: New file.
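
For the testcases, the version ordering is probably the easiest part to pin down. Below is a rough standalone sketch of the ordering as I read it from the comments in compare_fmv_features in the quoted patch: the default version is lowest, then explicit priority, then the number of feature bits set, then the lowest differing bit. FeatureBits, kFeatureBitsLength and compare_versions are hypothetical stand-ins that I made up so this builds outside GCC; they are not the patch's code. The bit positions in the usage note (v = 21, zba = 27) are taken from riscv_ext_bitmask_table in the patch. It assumes a GCC or Clang host compiler for the __builtin_* calls.

```cpp
// Hypothetical stand-in for struct riscv_feature_bits in the patch.
constexpr unsigned kFeatureBitsLength = 2;

struct FeatureBits {
  unsigned length;  // 0 marks the "default" version
  unsigned long long features[kFeatureBitsLength];
};

// Returns 1 if version 1 has higher priority, -1 if version 2 has
// higher priority, and 0 if the two versions are indistinguishable.
int compare_versions(const FeatureBits &m1, const FeatureBits &m2,
                     int prio1, int prio2)
{
  // 1. A length of 0 means the default version, which ranks lowest.
  if (m1.length != m2.length)
    return m1.length > m2.length ? 1 : -1;

  // 2. An explicit priority wins over any feature-based heuristic.
  if (prio1 != prio2)
    return prio1 > prio2 ? 1 : -1;

  // 3. More required feature bits means a more specific version.
  unsigned pop1 = 0, pop2 = 0;
  for (unsigned i = 0; i < m1.length; ++i) {
    pop1 += __builtin_popcountll(m1.features[i]);  // GCC/Clang builtin
    pop2 += __builtin_popcountll(m2.features[i]);
  }
  if (pop1 != pop2)
    return pop1 > pop2 ? 1 : -1;

  // 4. Tie-break on the lowest bit in which the two masks differ:
  //    whichever mask has that bit set ranks higher.
  for (unsigned i = 0; i < m1.length; ++i) {
    unsigned long long diff = m1.features[i] ^ m2.features[i];
    if (diff == 0)
      continue;
    int bit = __builtin_ctzll(diff);  // index of lowest differing bit
    return ((m1.features[i] >> bit) & 1ULL) ? 1 : -1;
  }

  // 5. Identical masks and priorities.
  return 0;
}
```

With this model, a version requiring only V (bit 21) ranks above the default, a Zba version with priority 1 ranks above a V version with priority 0, and with equal priorities V ranks above Zba because bit 21 is the lowest differing bit. Checks along these lines might be a starting point for the dg tests, assuming the model matches what the patch intends.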
> --- > gcc/common/config/riscv/riscv-common.cc | 145 +++++ > gcc/config/riscv/feature_bits.h | 44 ++ > gcc/config/riscv/riscv-protos.h | 4 + > gcc/config/riscv/riscv-subset.h | 5 + > gcc/config/riscv/riscv-target-attr.cc | 210 +++++-- > gcc/config/riscv/riscv.cc | 757 ++++++++++++++++++++++++ > gcc/config/riscv/riscv.h | 7 + > gcc/config/riscv/riscv.opt | 3 + > gcc/defaults.h | 4 + > gcc/multiple_target.cc | 19 +- > 10 files changed, 1157 insertions(+), 41 deletions(-) > create mode 100644 gcc/config/riscv/feature_bits.h > > diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc > index 2adebe0b6f2..2fc21424ebe 100644 > --- a/gcc/common/config/riscv/riscv-common.cc > +++ b/gcc/common/config/riscv/riscv-common.cc > @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3. If not see > > #include <sstream> > #include <vector> > +#include <queue> > > #define INCLUDE_STRING > #define INCLUDE_SET > @@ -1760,6 +1761,75 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] = > {NULL, NULL, NULL, 0} > }; > > +/* Types for recording extension to RISC-V C-API bitmask. */ > +struct riscv_ext_bitmask_table_t { > + const char *ext; > + int groupid; > + int bit_position; > +}; > + > +/* Mapping table between extension to RISC-V C-API extension bitmask. > + This table should sort the extension by Linux hwprobe order to get the > + minimal feature bits. 
*/ > +static const riscv_ext_bitmask_table_t riscv_ext_bitmask_table[] = > +{ > + {"i", 0, 8}, > + {"m", 0, 12}, > + {"a", 0, 0}, > + {"f", 0, 5}, > + {"d", 0, 3}, > + {"c", 0, 2}, > + {"v", 0, 21}, > + {"zba", 0, 27}, > + {"zbb", 0, 28}, > + {"zbs", 0, 33}, > + {"zicboz", 0, 37}, > + {"zbc", 0, 29}, > + {"zbkb", 0, 30}, > + {"zbkc", 0, 31}, > + {"zbkx", 0, 32}, > + {"zknd", 0, 41}, > + {"zkne", 0, 42}, > + {"zknh", 0, 43}, > + {"zksed", 0, 44}, > + {"zksh", 0, 45}, > + {"zkt", 0, 46}, > + {"zvbb", 0, 48}, > + {"zvbc", 0, 49}, > + {"zvkb", 0, 52}, > + {"zvkg", 0, 53}, > + {"zvkned", 0, 54}, > + {"zvknha", 0, 55}, > + {"zvknhb", 0, 56}, > + {"zvksed", 0, 57}, > + {"zvksh", 0, 58}, > + {"zvkt", 0, 59}, > + {"zfh", 0, 35}, > + {"zfhmin", 0, 36}, > + {"zihintntl", 0, 39}, > + {"zvfh", 0, 50}, > + {"zvfhmin", 0, 51}, > + {"zfa", 0, 34}, > + {"ztso", 0, 47}, > + {"zacas", 0, 26}, > + {"zicond", 0, 38}, > + {"zihintpause", 0, 40}, > + {"zve32x", 0, 60}, > + {"zve32f", 0, 61}, > + {"zve64x", 0, 62}, > + {"zve64f", 0, 63}, > + {"zve64d", 1, 0}, > + {"zimop", 1, 1}, > + {"zca", 1, 2}, > + {"zcb", 1, 3}, > + {"zcd", 1, 4}, > + {"zcf", 1, 5}, > + {"zcmop", 1, 6}, > + {"zawrs", 1, 7}, > + > + {NULL, -1, -1} > +}; > + > /* Apply SUBSET_LIST to OPTS if OPTS is not null. */ > > void > @@ -1826,6 +1896,81 @@ riscv_x_target_flags_isa_mask (void) > return mask; > } > > +/* Get the minimal feature bits in Linux hwprobe of the given ISA string. > + > + Used for generating Function Multi-Versioning (FMV) dispatcher for RISC-V. > + > + The minimal feature bits refer to using the earliest extension that appeared > + in the Linux hwprobe to support the specified ISA string. This ensures that > + older kernels, which may lack certain implied extensions, can still run the > + FMV dispatcher correctly. 
*/ > + > +bool > +riscv_minimal_hwprobe_feature_bits (const char *isa, > + struct riscv_feature_bits *res, > + location_t loc) > +{ > + riscv_subset_list *subset_list; > + subset_list = riscv_subset_list::parse (isa, loc); > + if (!subset_list) > + return false; > + > + /* Initialize the result feature bits to zero. */ > + res->length = RISCV_FEATURE_BITS_LENGTH; > + for (int i = 0; i < RISCV_FEATURE_BITS_LENGTH; ++i) > + res->features[i] = 0; > + > + /* Use a std::set to record all visited implied extensions. */ > + std::set <std::string> implied_exts; > + > + /* Iterate through the extension bitmask table in Linux hwprobe order to get > + the minimal covered feature bits. Avoiding some sub-extensions which will > + be implied by the super-extensions like V implied Zve32x. */ > + const riscv_ext_bitmask_table_t *ext_bitmask_tab; > + for (ext_bitmask_tab = &riscv_ext_bitmask_table[0]; > + ext_bitmask_tab->ext; > + ++ext_bitmask_tab) > + { > + /* Skip the extension if it is not in the subset list or already implied > + by previous extension. */ > + if (subset_list->lookup (ext_bitmask_tab->ext) == NULL > + || implied_exts.count (ext_bitmask_tab->ext)) > + continue; > + > + res->features[ext_bitmask_tab->groupid] > + |= 1ULL << ext_bitmask_tab->bit_position; > + > + /* Find the sub-extension using BFS and set the corresponding bit. */ > + std::queue <const char *> search_q; > + search_q.push (ext_bitmask_tab->ext); > + > + while (!search_q.empty ()) > + { > + const char * search_ext = search_q.front (); > + search_q.pop (); > + > + /* Iterate through the implied extension table. */ > + const riscv_implied_info_t *implied_info; > + for (implied_info = &riscv_implied_info[0]; > + implied_info->ext; > + ++implied_info) > + { > + /* When the search extension matches the implied extension and > + the implied extension has not been visited, mark the implied > + extension in the implied_exts set and push it into the > + queue. 
*/ > + if (implied_info->match (subset_list, search_ext) > + && implied_exts.count (implied_info->implied_ext) == 0) > + { > + implied_exts.insert (implied_info->implied_ext); > + search_q.push (implied_info->implied_ext); > + } > + } > + } > + } > + return true; > +} > + > /* Parse a RISC-V ISA string into an option mask. Must clear or set all arch > dependent mask bits, in case more than one -march string is passed. */ > > diff --git a/gcc/config/riscv/feature_bits.h b/gcc/config/riscv/feature_bits.h > new file mode 100644 > index 00000000000..19b7630e339 > --- /dev/null > +++ b/gcc/config/riscv/feature_bits.h > @@ -0,0 +1,44 @@ > +/* Definition of RISC-V feature bits corresponding to > + libgcc/config/riscv/feature_bits.c > + Copyright (C) 2024 Free Software Foundation, Inc. > + > +This file is part of GCC. > + > +GCC is free software; you can redistribute it and/or modify > +it under the terms of the GNU General Public License as published by > +the Free Software Foundation; either version 3, or (at your option) > +any later version. > + > +GCC is distributed in the hope that it will be useful, > +but WITHOUT ANY WARRANTY; without even the implied warranty of > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +GNU General Public License for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +<http://www.gnu.org/licenses/>. 
*/ > + > +#ifndef GCC_RISCV_FEATURE_BITS_H > +#define GCC_RISCV_FEATURE_BITS_H > + > +#define RISCV_FEATURE_BITS_LENGTH 2 > + > +struct riscv_feature_bits { > + unsigned length; > + unsigned long long features[RISCV_FEATURE_BITS_LENGTH]; > +}; > + > +#define RISCV_VENDOR_FEATURE_BITS_LENGTH 1 > + > +struct riscv_vendor_feature_bits { > + unsigned length; > + unsigned long long features[RISCV_VENDOR_FEATURE_BITS_LENGTH]; > +}; > + > +struct riscv_cpu_model { > + unsigned mvendorid; > + unsigned long long marchid; > + unsigned long long mimpid; > +}; > + > +#endif /* GCC_RISCV_FEATURE_BITS_H */ > diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h > index d690162bb0c..a35316f1228 100644 > --- a/gcc/config/riscv/riscv-protos.h > +++ b/gcc/config/riscv/riscv-protos.h > @@ -799,6 +799,10 @@ extern bool riscv_use_divmod_expander (void); > void riscv_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int); > extern bool > riscv_option_valid_attribute_p (tree, tree, tree, int); > +extern bool > +riscv_option_valid_version_attribute_p (tree, tree, tree, int); > +extern bool > +riscv_process_target_attr (const char *, location_t); > extern void > riscv_override_options_internal (struct gcc_options *); > extern void riscv_option_override (void); > diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h > index 1914a5317d7..75008be6613 100644 > --- a/gcc/config/riscv/riscv-subset.h > +++ b/gcc/config/riscv/riscv-subset.h > @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3. If not see > #ifndef GCC_RISCV_SUBSET_H > #define GCC_RISCV_SUBSET_H > > +#include "feature_bits.h" > + > #define RISCV_DONT_CARE_VERSION -1 > > /* Subset info. 
*/ > @@ -120,6 +122,9 @@ public: > extern const riscv_subset_list *riscv_cmdline_subset_list (void); > extern void > riscv_set_arch_by_subset_list (riscv_subset_list *, struct gcc_options *); > +extern bool riscv_minimal_hwprobe_feature_bits (const char *, > + struct riscv_feature_bits *, > + location_t); > extern bool > riscv_ext_is_subset (struct cl_target_option *, struct cl_target_option *); > extern int riscv_x_target_flags_isa_mask (void); > diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc > index bf14ade5ce0..6a28281b0c9 100644 > --- a/gcc/config/riscv/riscv-target-attr.cc > +++ b/gcc/config/riscv/riscv-target-attr.cc > @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3. If not see > #include "diagnostic.h" > #include "opts.h" > #include "riscv-subset.h" > +#include "stringpool.h" > +#include "attribs.h" > > namespace { > class riscv_target_attr_parser > @@ -39,6 +41,7 @@ public: > : m_found_arch_p (false) > , m_found_tune_p (false) > , m_found_cpu_p (false) > + , m_found_priority_p (false) > , m_subset_list (nullptr) > , m_loc (loc) > , m_cpu_info (nullptr) > @@ -49,6 +52,7 @@ public: > bool handle_arch (const char *); > bool handle_cpu (const char *); > bool handle_tune (const char *); > + bool handle_priority (const char *); > > void update_settings (struct gcc_options *opts) const; > private: > @@ -58,10 +62,12 @@ private: > bool m_found_arch_p; > bool m_found_tune_p; > bool m_found_cpu_p; > + bool m_found_priority_p; > riscv_subset_list *m_subset_list; > location_t m_loc; > const riscv_cpu_info *m_cpu_info; > const char *m_tune; > + int m_priority; > }; > } > > @@ -80,7 +86,8 @@ struct riscv_attribute_info > static const struct riscv_attribute_info riscv_attributes[] > = {{"arch", &riscv_target_attr_parser::handle_arch}, > {"cpu", &riscv_target_attr_parser::handle_cpu}, > - {"tune", &riscv_target_attr_parser::handle_tune}}; > + {"tune", &riscv_target_attr_parser::handle_tune}, > + {"priority", 
&riscv_target_attr_parser::handle_priority}}; > > bool > riscv_target_attr_parser::parse_arch (const char *str) > @@ -210,6 +217,22 @@ riscv_target_attr_parser::handle_tune (const char *str) > return true; > } > > +bool > +riscv_target_attr_parser::handle_priority (const char *str) > +{ > + if (m_found_priority_p) > + error_at (m_loc, "%<target()%> attribute: priority appears more than once"); > + m_found_priority_p = true; > + > + if (sscanf (str, "%d", &m_priority) != 1) > + { > + error_at (m_loc, "%<target()%> attribute: invalid priority %qs", str); > + return false; > + } > + > + return true; > +} > + > void > riscv_target_attr_parser::update_settings (struct gcc_options *opts) const > { > @@ -217,10 +240,6 @@ riscv_target_attr_parser::update_settings (struct gcc_options *opts) const > { > std::string local_arch = m_subset_list->to_string (true); > const char* local_arch_str = local_arch.c_str (); > - struct cl_target_option *default_opts > - = TREE_TARGET_OPTION (target_option_default_node); > - if (opts->x_riscv_arch_string != default_opts->x_riscv_arch_string) > - free (CONST_CAST (void *, (const void *) opts->x_riscv_arch_string)); > opts->x_riscv_arch_string = xstrdup (local_arch_str); > > riscv_set_arch_by_subset_list (m_subset_list, opts); > @@ -236,13 +255,16 @@ riscv_target_attr_parser::update_settings (struct gcc_options *opts) const > if (m_cpu_info) > opts->x_riscv_tune_string = m_cpu_info->tune; > } > + > + if (m_found_priority_p) > + opts->x_riscv_fmv_priority = m_priority; > } > > /* Parse ARG_STR which contains the definition of one target attribute. > Show appropriate errors if any or return true if the attribute is valid. */ > > static bool > -riscv_process_one_target_attr (char *arg_str, > +riscv_process_one_target_attr (const char *arg_str, > location_t loc, > riscv_target_attr_parser &attr_parser) > { > @@ -271,6 +293,12 @@ riscv_process_one_target_attr (char *arg_str, > > arg[0] = '\0'; > ++arg; > + > + /* Skip splitter ';' if it exists. 
*/ > + char *splitter = strchr (arg, ';'); > + if (splitter) > + splitter[0] = '\0'; > + > for (const auto &attr : riscv_attributes) > { > /* If the names don't match up, or the user has given an argument > @@ -304,35 +332,13 @@ num_occurrences_in_str (char c, char *str) > return res; > } > > -/* Parse the tree in ARGS that contains the target attribute information > +/* Parse the string ARGS that contains the target attribute information > and update the global target options space. */ > > -static bool > -riscv_process_target_attr (tree args, location_t loc) > +bool > +riscv_process_target_attr (const char *args, location_t loc) > { > - if (TREE_CODE (args) == TREE_LIST) > - { > - do > - { > - tree head = TREE_VALUE (args); > - if (head) > - { > - if (!riscv_process_target_attr (head, loc)) > - return false; > - } > - args = TREE_CHAIN (args); > - } while (args); > - > - return true; > - } > - > - if (TREE_CODE (args) != STRING_CST) > - { > - error_at (loc, "attribute %<target%> argument not a string"); > - return false; > - } > - > - size_t len = strlen (TREE_STRING_POINTER (args)); > + size_t len = strlen (args); > > /* No need to emit warning or error on empty string here, generic code already > handle this case. */ > @@ -341,9 +347,14 @@ riscv_process_target_attr (tree args, location_t loc) > return false; > } > > + if (strcmp (args, "default") == 0) > + { > + return true; > + } > + > std::unique_ptr<char[]> buf (new char[len+1]); > char *str_to_check = buf.get (); > - strcpy (str_to_check, TREE_STRING_POINTER (args)); > + strcpy (str_to_check, args); > > /* Used to catch empty spaces between semi-colons i.e. > attribute ((target ("attr1;;attr2"))). 
*/ > @@ -366,7 +377,7 @@ riscv_process_target_attr (tree args, location_t loc) > if (num_attrs != num_semicolons + 1) > { > error_at (loc, "malformed %<target(\"%s\")%> attribute", > - TREE_STRING_POINTER (args)); > + args); > return false; > } > > @@ -376,6 +387,37 @@ riscv_process_target_attr (tree args, location_t loc) > return true; > } > > +/* Parse the tree in ARGS that contains the target attribute information > + and update the global target options space. */ > + > +static bool > +riscv_process_target_attr (tree args, location_t loc) > +{ > + if (TREE_CODE (args) == TREE_LIST) > + { > + do > + { > + tree head = TREE_VALUE (args); > + if (head) > + { > + if (!riscv_process_target_attr (head, loc)) > + return false; > + } > + args = TREE_CHAIN (args); > + } while (args); > + > + return true; > + } > + > + if (TREE_CODE (args) != STRING_CST) > + { > + error_at (loc, "attribute %<target%> argument not a string"); > + return false; > + } > + > + return riscv_process_target_attr (TREE_STRING_POINTER (args), loc); > +} > + > /* Implement TARGET_OPTION_VALID_ATTRIBUTE_P. > This is used to process attribute ((target ("..."))). > Note, that riscv_set_current_function() has not been called before, > @@ -412,6 +454,20 @@ riscv_option_valid_attribute_p (tree fndecl, tree, tree args, int) > > /* Now we can parse the attributes and set &global_options accordingly. */ > ret = riscv_process_target_attr (args, loc); > + if (ret) > + { > + tree version_attr = lookup_attribute ("target_version", > + DECL_ATTRIBUTES (fndecl)); > + if (version_attr != NULL_TREE) > + { > + // Reapply any target_version attribute after target attribute. > + // This should be equivalent to applying the target_version once > + // after processing all target attributes. 
> + tree version_args = TREE_VALUE (version_attr); > + riscv_process_target_attr (version_args, > + DECL_SOURCE_LOCATION (fndecl)); > + } > + } > if (ret) > { > riscv_override_options_internal (&global_options); > @@ -424,3 +480,89 @@ riscv_option_valid_attribute_p (tree fndecl, tree, tree args, int) > cl_target_option_restore (&global_options, &global_options_set, &cur_target); > return ret; > } > + > +/* Parse the tree in ARGS that contains the target_version attribute > + information and update the global target options space. */ > + > +static bool > +riscv_process_target_version_attr (tree args, location_t loc) > +{ > + if (TREE_CODE (args) == TREE_LIST) > + { > + if (TREE_CHAIN (args)) > + { > + error ("attribute %<target_version%> has multiple values"); > + return false; > + } > + args = TREE_VALUE (args); > + } > + > + if (!args || TREE_CODE (args) != STRING_CST) > + { > + error ("attribute %<target_version%> argument not a string"); > + return false; > + } > + > + const char *str = TREE_STRING_POINTER (args); > + if (strcmp (str, "default") == 0) > + return true; > + > + riscv_target_attr_parser attr_parser (loc); > + if (!riscv_process_one_target_attr (str, loc, attr_parser)) > + return false; > + > + attr_parser.update_settings (&global_options); > + return true; > +} > + > + > +/* Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P. This is used to > + process attribute ((target_version ("..."))). */ > + > +bool > +riscv_option_valid_version_attribute_p (tree fndecl, tree, tree args, int) > +{ > + struct cl_target_option cur_target; > + bool ret; > + tree new_target; > + tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl); > + location_t loc = DECL_SOURCE_LOCATION (fndecl); > + > + /* Save the current target options to restore at the end. 
*/ > + cl_target_option_save (&cur_target, &global_options, &global_options_set); > + > + /* If fndecl already has some target attributes applied to it, unpack > + them so that we add this attribute on top of them, rather than > + overwriting them. */ > + if (existing_target) > + { > + struct cl_target_option *existing_options > + = TREE_TARGET_OPTION (existing_target); > + > + if (existing_options) > + cl_target_option_restore (&global_options, &global_options_set, > + existing_options); > + } > + else > + cl_target_option_restore (&global_options, &global_options_set, > + TREE_TARGET_OPTION (target_option_current_node)); > + > + ret = riscv_process_target_attr (args, loc); > + > + /* Set up any additional state. */ > + if (ret) > + { > + riscv_override_options_internal (&global_options); > + new_target = build_target_option_node (&global_options, > + &global_options_set); > + } > + else > + new_target = NULL; > + > + if (fndecl && ret) > + DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target; > + > + cl_target_option_restore (&global_options, &global_options_set, &cur_target); > + > + return ret; > +} > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 3ac40234345..f091d022e43 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -77,6 +77,9 @@ along with GCC; see the file COPYING3. If not see > #include "tree-dfa.h" > #include "target-globals.h" > #include "riscv-v.h" > +#include "cgraph.h" > +#include "langhooks.h" > +#include "gimplify.h" > > /* This file should be included last. */ > #include "target-def.h" > @@ -7666,6 +7669,10 @@ riscv_compute_frame_info (void) > static bool > riscv_can_inline_p (tree caller, tree callee) > { > + /* Do not inline when callee is versioned but caller is not. */ > + if (DECL_FUNCTION_VERSIONED (callee) && ! 
DECL_FUNCTION_VERSIONED (caller)) > + return false; > + > tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); > tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); > > @@ -12574,6 +12581,735 @@ riscv_c_mode_for_floating_type (enum tree_index ti) > return default_mode_for_floating_type (ti); > } > > +/* This parses the attribute arguments to target_version in DECL and modifies > + the feature mask and priority required to select those targets. */ > +static void > +parse_features_for_version (tree decl, > + struct riscv_feature_bits &res, > + int &priority) > +{ > + tree version_attr = lookup_attribute ("target_version", > + DECL_ATTRIBUTES (decl)); > + if (version_attr == NULL_TREE) > + { > + res.length = 0; > + priority = 0; > + return; > + } > + > + const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE > + (version_attr))); > + gcc_assert (version_string != NULL); > + if (strcmp (version_string, "default") == 0) > + { > + res.length = 0; > + priority = 0; > + return; > + } > + struct cl_target_option cur_target; > + cl_target_option_save (&cur_target, &global_options, > + &global_options_set); > + /* Always set to default option before parsing "arch=+..." 
*/ > + struct cl_target_option *default_opts > + = TREE_TARGET_OPTION (target_option_default_node); > + cl_target_option_restore (&global_options, &global_options_set, > + default_opts); > + > + riscv_process_target_attr (version_string, > + DECL_SOURCE_LOCATION (decl)); > + > + priority = global_options.x_riscv_fmv_priority; > + const char *arch_string = global_options.x_riscv_arch_string; > + bool parse_res > + = riscv_minimal_hwprobe_feature_bits (arch_string, &res, > + DECL_SOURCE_LOCATION (decl)); > + gcc_assert (parse_res); > + > + if (arch_string != default_opts->x_riscv_arch_string) > + free (CONST_CAST (void *, (const void *) arch_string)); > + > + cl_target_option_restore (&global_options, &global_options_set, > + &cur_target); > +} > + > +/* Compare priorities of two feature masks. Return: > + 1: mask1 is higher priority > + -1: mask2 is higher priority > + 0: masks are equal. > + Since riscv_feature_bits has total 128 bits to be used as mask, when counting > + the total 1s in the mask, the 1s in group1 needs to multiply a weight. */ > +static int > +compare_fmv_features (const struct riscv_feature_bits &mask1, > + const struct riscv_feature_bits &mask2, > + int prio1, int prio2) > +{ > + unsigned length1 = mask1.length, length2 = mask2.length; > + /* 1. Compare length, for length == 0 means default version, which should be > + the lowest priority). */ > + if (length1 != length2) > + return length1 > length2 ? 1 : -1; > + /* 2. Compare the priority. */ > + if (prio1 != prio2) > + return prio1 > prio2 ? 1 : -1; > + /* 3. Compare the total number of 1s in the mask. */ > + unsigned pop1 = 0, pop2 = 0; > + for (int i = 0; i < length1; i++) > + { > + pop1 += __builtin_popcountll (mask1.features[i]); > + pop2 += __builtin_popcountll (mask2.features[i]); > + } > + if (pop1 != pop2) > + return pop1 > pop2 ? 1 : -1; > + /* 4. Compare the mask bit by bit order. 
*/ > + for (int i = 0; i < length1; i++) > + { > + unsigned long long xor_mask = mask1.features[i] ^ mask2.features[i]; > + if (xor_mask == 0) > + continue; > + return TEST_BIT (mask1.features[i], __builtin_ctzll (xor_mask)) ? 1 : -1; > + } > + /* 5. If all bits are equal, return 0. */ > + return 0; > +} > + > +/* Compare priorities of two version decls. Return: > + 1: mask1 is higher priority > + -1: mask2 is higher priority > + 0: masks are equal. */ > +int > +riscv_compare_version_priority (tree decl1, tree decl2) > +{ > + struct riscv_feature_bits mask1, mask2; > + int prio1, prio2; > + > + parse_features_for_version (decl1, mask1, prio1); > + parse_features_for_version (decl2, mask2, prio2); > + > + return compare_fmv_features (mask1, mask2, prio1, prio2); > +} > + > + > +/* This function returns true if FN1 and FN2 are versions of the same function, > + that is, the target_version attributes of the function decls are different. > + This assumes that FN1 and FN2 have the same signature. */ > + > +bool > +riscv_common_function_versions (tree fn1, tree fn2) > +{ > + if (TREE_CODE (fn1) != FUNCTION_DECL > + || TREE_CODE (fn2) != FUNCTION_DECL) > + return false; > + > + return riscv_compare_version_priority (fn1, fn2) != 0; > +} > + > +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL > + to return a pointer to VERSION_DECL if all feature bits specified in > + FEATURE_MASK are not set in MASK_VAR. This function will be called during > + version dispatch to decide which function version to execute. It returns > + the basic block at the end, to which more conditions can be added. 
*/ > +static basic_block > +add_condition_to_bb (tree function_decl, tree version_decl, > + const struct riscv_feature_bits *features, > + tree mask_var, basic_block new_bb) > +{ > + gimple *return_stmt; > + tree convert_expr, result_var; > + gimple *convert_stmt; > + gimple *if_else_stmt; > + > + basic_block bb1, bb2, bb3; > + edge e12, e23; > + > + gimple_seq gseq; > + > + push_cfun (DECL_STRUCT_FUNCTION (function_decl)); > + > + gcc_assert (new_bb != NULL); > + gseq = bb_seq (new_bb); > + > + convert_expr = build1 (CONVERT_EXPR, ptr_type_node, > + build_fold_addr_expr (version_decl)); > + result_var = create_tmp_var (ptr_type_node); > + convert_stmt = gimple_build_assign (result_var, convert_expr); > + return_stmt = gimple_build_return (result_var); > + > + if (features->length == 0) > + { > + /* Default version. */ > + gimple_seq_add_stmt (&gseq, convert_stmt); > + gimple_seq_add_stmt (&gseq, return_stmt); > + set_bb_seq (new_bb, gseq); > + gimple_set_bb (convert_stmt, new_bb); > + gimple_set_bb (return_stmt, new_bb); > + pop_cfun (); > + return new_bb; > + } > + > + tree zero_llu = build_int_cst (long_long_unsigned_type_node, 0); > + tree cond_status = create_tmp_var (boolean_type_node); > + tree mask_array_ele_var = create_tmp_var (long_long_unsigned_type_node); > + tree and_expr_var = create_tmp_var (long_long_unsigned_type_node); > + tree eq_expr_var = create_tmp_var (boolean_type_node); > + > + /* cond_status = true. 
*/ > + gimple *cond_init_stmt = gimple_build_assign (cond_status, boolean_true_node); > + gimple_set_block (cond_init_stmt, DECL_INITIAL (function_decl)); > + gimple_set_bb (cond_init_stmt, new_bb); > + gimple_seq_add_stmt (&gseq, cond_init_stmt); > + > + for (int i = 0; i < RISCV_FEATURE_BITS_LENGTH; i++) > + { > + tree index_expr = build_int_cst (unsigned_type_node, i); > + /* mask_array_ele_var = mask_var[i] */ > + tree mask_array_ref = build4 (ARRAY_REF, long_long_unsigned_type_node, > + mask_var, index_expr, NULL_TREE, NULL_TREE); > + > + gimple *mask_stmt = gimple_build_assign (mask_array_ele_var, > + mask_array_ref); > + gimple_set_block (mask_stmt, DECL_INITIAL (function_decl)); > + gimple_set_bb (mask_stmt, new_bb); > + gimple_seq_add_stmt (&gseq, mask_stmt); > + /* and_expr_var = mask_array_ele_var & features[i] */ > + tree and_expr = build2 (BIT_AND_EXPR, > + long_long_unsigned_type_node, > + mask_array_ele_var, > + build_int_cst (long_long_unsigned_type_node, > + features->features[i])); > + gimple *and_stmt = gimple_build_assign (and_expr_var, and_expr); > + gimple_set_block (and_stmt, DECL_INITIAL (function_decl)); > + gimple_set_bb (and_stmt, new_bb); > + gimple_seq_add_stmt (&gseq, and_stmt); > + /* eq_expr_var = and_expr_var == 0. */ > + tree eq_expr = build2 (EQ_EXPR, boolean_type_node, > + and_expr_var, zero_llu); > + gimple *eq_stmt = gimple_build_assign (eq_expr_var, eq_expr); > + gimple_set_block (eq_stmt, DECL_INITIAL (function_decl)); > + gimple_set_bb (eq_stmt, new_bb); > + gimple_seq_add_stmt (&gseq, eq_stmt); > + /* cond_status = cond_status & eq_expr_var. 
*/ > + tree cond_expr = build2 (BIT_AND_EXPR, boolean_type_node, > + cond_status, eq_expr_var); > + gimple *cond_stmt = gimple_build_assign (cond_status, cond_expr); > + gimple_set_block (cond_stmt, DECL_INITIAL (function_decl)); > + gimple_set_bb (cond_stmt, new_bb); > + gimple_seq_add_stmt (&gseq, cond_stmt); > + } > + if_else_stmt = gimple_build_cond (EQ_EXPR, cond_status, boolean_true_node, > + NULL_TREE, NULL_TREE); > + gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl)); > + gimple_set_bb (if_else_stmt, new_bb); > + gimple_seq_add_stmt (&gseq, if_else_stmt); > + > + gimple_seq_add_stmt (&gseq, convert_stmt); > + gimple_seq_add_stmt (&gseq, return_stmt); > + set_bb_seq (new_bb, gseq); > + > + bb1 = new_bb; > + e12 = split_block (bb1, if_else_stmt); > + bb2 = e12->dest; > + e12->flags &= ~EDGE_FALLTHRU; > + e12->flags |= EDGE_TRUE_VALUE; > + > + e23 = split_block (bb2, return_stmt); > + > + gimple_set_bb (convert_stmt, bb2); > + gimple_set_bb (return_stmt, bb2); > + > + bb3 = e23->dest; > + make_edge (bb1, bb3, EDGE_FALSE_VALUE); > + > + remove_edge (e23); > + make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0); > + > + pop_cfun (); > + > + return bb3; > +} > + > +/* This function generates the dispatch function for > + multi-versioned functions. DISPATCH_DECL is the function which will > + contain the dispatch logic. FNDECLS are the function choices for > + dispatch, and is a tree chain. EMPTY_BB is the basic block pointer > + in DISPATCH_DECL in which the dispatch code is generated. */ > + > +static int > +dispatch_function_versions (tree dispatch_decl, > + void *fndecls_p, > + basic_block *empty_bb) > +{ > + gimple *ifunc_cpu_init_stmt; > + gimple_seq gseq; > + vec<tree> *fndecls; > + > + gcc_assert (dispatch_decl != NULL > + && fndecls_p != NULL > + && empty_bb != NULL); > + > + push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl)); > + > + gseq = bb_seq (*empty_bb); > + /* Function version dispatch is via IFUNC. 
IFUNC resolvers fire before > + constructors, so explicitly call __init_riscv_feature_bits here. */ > + tree init_fn_type = build_function_type_list (void_type_node, > + long_unsigned_type_node, > + ptr_type_node, > + NULL); > + tree init_fn_id = get_identifier ("__init_riscv_feature_bits"); > + tree init_fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, > + init_fn_id, init_fn_type); > + ifunc_cpu_init_stmt = gimple_build_call (init_fn_decl, 0); > + gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt); > + gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb); > + > + /* Build the struct type for __riscv_feature_bits. */ > + tree global_type = lang_hooks.types.make_type (RECORD_TYPE); > + tree features_type = build_array_type_nelts (long_long_unsigned_type_node, > + RISCV_FEATURE_BITS_LENGTH); > + tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL, > + get_identifier ("length"), > + unsigned_type_node); > + tree field2 = build_decl (UNKNOWN_LOCATION, FIELD_DECL, > + get_identifier ("features"), > + features_type); > + DECL_FIELD_CONTEXT (field1) = global_type; > + DECL_FIELD_CONTEXT (field2) = global_type; > + TYPE_FIELDS (global_type) = field1; > + DECL_CHAIN (field1) = field2; > + layout_type (global_type); > + > + tree global_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, > + get_identifier ("__riscv_feature_bits"), > + global_type); > + DECL_EXTERNAL (global_var) = 1; > + tree mask_var = create_tmp_var (features_type); > + tree feature_ele_var = create_tmp_var (long_long_unsigned_type_node); > + tree noted_var = create_tmp_var (long_long_unsigned_type_node); > + > + > + for (int i = 0; i < RISCV_FEATURE_BITS_LENGTH; i++) > + { > + tree index_expr = build_int_cst (unsigned_type_node, i); > + /* feature_ele_var = __riscv_feature_bits.features[i] */ > + tree component_expr = build3 (COMPONENT_REF, features_type, > + global_var, field2, NULL_TREE); > + tree feature_array_ref = build4 (ARRAY_REF, long_long_unsigned_type_node, > + component_expr, index_expr, > +
NULL_TREE, NULL_TREE); > + gimple *feature_stmt = gimple_build_assign (feature_ele_var, > + feature_array_ref); > + gimple_set_block (feature_stmt, DECL_INITIAL (dispatch_decl)); > + gimple_set_bb (feature_stmt, *empty_bb); > + gimple_seq_add_stmt (&gseq, feature_stmt); > + /* noted_var = ~feature_ele_var. */ > + tree not_expr = build1 (BIT_NOT_EXPR, long_long_unsigned_type_node, > + feature_ele_var); > + gimple *not_stmt = gimple_build_assign (noted_var, not_expr); > + gimple_set_block (not_stmt, DECL_INITIAL (dispatch_decl)); > + gimple_set_bb (not_stmt, *empty_bb); > + gimple_seq_add_stmt (&gseq, not_stmt); > + /* mask_var[i] = ~feature_ele_var. */ > + tree mask_array_ref = build4 (ARRAY_REF, long_long_unsigned_type_node, > + mask_var, index_expr, NULL_TREE, NULL_TREE); > + gimple *mask_assign_stmt = gimple_build_assign (mask_array_ref, > + noted_var); > + gimple_set_block (mask_assign_stmt, DECL_INITIAL (dispatch_decl)); > + gimple_set_bb (mask_assign_stmt, *empty_bb); > + gimple_seq_add_stmt (&gseq, mask_assign_stmt); > + } > + > + set_bb_seq (*empty_bb, gseq); > + > + pop_cfun (); > + > + /* fndecls_p is actually a vector. */ > + fndecls = static_cast<vec<tree> *> (fndecls_p); > + > + /* At least one more version other than the default. */ > + unsigned int num_versions = fndecls->length (); > + gcc_assert (num_versions >= 2); > + > + struct function_version_info > + { > + tree version_decl; > + int prio; > + struct riscv_feature_bits features; > + } *function_versions; > + > + function_versions = (struct function_version_info *) > + XNEWVEC (struct function_version_info, (num_versions)); > + > + unsigned int actual_versions = 0; > + > + for (tree version_decl : *fndecls) > + { > + function_versions[actual_versions].version_decl = version_decl; > + // Get attribute string, parse it and find the right features. 
> + parse_features_for_version (version_decl, > + function_versions[actual_versions].features, > + function_versions[actual_versions].prio); > + actual_versions++; > + } > + > + > + auto compare_feature_version_info = [](const void *p1, const void *p2) > + { > + const function_version_info v1 = *(const function_version_info *)p1; > + const function_version_info v2 = *(const function_version_info *)p2; > + return - compare_fmv_features (v1.features, v2.features, > + v1.prio, v2.prio); > + }; > + > + /* Sort the versions according to descending order of dispatch priority. */ > + qsort (function_versions, actual_versions, > + sizeof (struct function_version_info), compare_feature_version_info); > + > + for (unsigned int i = 0; i < actual_versions; ++i) > + { > + *empty_bb = add_condition_to_bb (dispatch_decl, > + function_versions[i].version_decl, > + &function_versions[i].features, > + mask_var, > + *empty_bb); > + } > + > + free (function_versions); > + return 0; > +} > + > +/* Return an identifier for the base assembler name of a versioned function. > + This is computed by taking the default version's assembler name, and > + stripping off the ".default" suffix if it's already been appended. */ > + > +static tree > +get_suffixed_assembler_name (tree default_decl, const char *suffix) > +{ > + std::string name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (default_decl)); > + > + auto size = name.size (); > + if (size >= 8 && name.compare (size - 8, 8, ".default") == 0) > + name.resize (size - 8); > + name += suffix; > + return get_identifier (name.c_str ()); > +} > + > +/* Make the resolver function decl to dispatch the versions of > + a multi-versioned function, DEFAULT_DECL. IFUNC_ALIAS_DECL is > + ifunc alias that will point to the created resolver. Create an > + empty basic block in the resolver and store the pointer in > + EMPTY_BB. Return the decl of the resolver function. 
*/ > + > +static tree > +make_resolver_func (const tree default_decl, > + const tree ifunc_alias_decl, > + basic_block *empty_bb) > +{ > + tree decl, type, t; > + > + /* Create resolver function name based on default_decl. We need to remove an > + existing ".default" suffix if this has already been appended. */ > + tree decl_name = get_suffixed_assembler_name (default_decl, ".resolver"); > + const char *resolver_name = IDENTIFIER_POINTER (decl_name); > + > + /* The resolver function should have signature > + (void *) resolver (uint64_t, void *) */ > + type = build_function_type_list (ptr_type_node, > + uint64_type_node, > + ptr_type_node, > + NULL_TREE); > + > + decl = build_fn_decl (resolver_name, type); > + SET_DECL_ASSEMBLER_NAME (decl, decl_name); > + > + DECL_NAME (decl) = decl_name; > + TREE_USED (decl) = 1; > + DECL_ARTIFICIAL (decl) = 1; > + DECL_IGNORED_P (decl) = 1; > + TREE_PUBLIC (decl) = 0; > + DECL_UNINLINABLE (decl) = 1; > + > + /* Resolver is not external, body is generated. */ > + DECL_EXTERNAL (decl) = 0; > + DECL_EXTERNAL (ifunc_alias_decl) = 0; > + > + DECL_CONTEXT (decl) = NULL_TREE; > + DECL_INITIAL (decl) = make_node (BLOCK); > + DECL_STATIC_CONSTRUCTOR (decl) = 0; > + > + if (DECL_COMDAT_GROUP (default_decl) > + || TREE_PUBLIC (default_decl)) > + { > + /* In this case, each translation unit with a call to this > + versioned function will put out a resolver. Ensure it > + is comdat to keep just one copy. */ > + DECL_COMDAT (decl) = 1; > + make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl)); > + } > + else > + TREE_PUBLIC (ifunc_alias_decl) = 0; > + > + /* Build result decl and add to function_decl. */ > + t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node); > + DECL_CONTEXT (t) = decl; > + DECL_ARTIFICIAL (t) = 1; > + DECL_IGNORED_P (t) = 1; > + DECL_RESULT (decl) = t; > + > + /* Build parameter decls and add to function_decl. 
*/ > + tree arg1 = build_decl (UNKNOWN_LOCATION, PARM_DECL, > + get_identifier ("hwcap"), > + uint64_type_node); > + tree arg2 = build_decl (UNKNOWN_LOCATION, PARM_DECL, > + get_identifier ("hwprobe_func"), > + ptr_type_node); > + DECL_CONTEXT (arg1) = decl; > + DECL_CONTEXT (arg2) = decl; > + DECL_ARTIFICIAL (arg1) = 1; > + DECL_ARTIFICIAL (arg2) = 1; > + DECL_IGNORED_P (arg1) = 1; > + DECL_IGNORED_P (arg2) = 1; > + DECL_ARG_TYPE (arg1) = uint64_type_node; > + DECL_ARG_TYPE (arg2) = ptr_type_node; > + DECL_ARGUMENTS (decl) = arg1; > + TREE_CHAIN (arg1) = arg2; > + > + gimplify_function_tree (decl); > + push_cfun (DECL_STRUCT_FUNCTION (decl)); > + *empty_bb = init_lowered_empty_function (decl, false, > + profile_count::uninitialized ()); > + > + cgraph_node::add_new_function (decl, true); > + symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl)); > + > + pop_cfun (); > + > + gcc_assert (ifunc_alias_decl != NULL); > + /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name. */ > + DECL_ATTRIBUTES (ifunc_alias_decl) > + = make_attribute ("ifunc", resolver_name, > + DECL_ATTRIBUTES (ifunc_alias_decl)); > + > + /* Create the alias for dispatch to resolver here. */ > + cgraph_node::create_same_body_alias (ifunc_alias_decl, decl); > + return decl; > +} > + > +/* Implement TARGET_MANGLE_DECL_ASSEMBLER_NAME, to add function multiversioning > + suffixes. */ > + > +tree > +riscv_mangle_decl_assembler_name (tree decl, tree id) > +{ > + /* For function version, add the target suffix to the assembler name. 
*/ > + if (TREE_CODE (decl) == FUNCTION_DECL > + && DECL_FUNCTION_VERSIONED (decl)) > + { > + std::string name = IDENTIFIER_POINTER (id) + std::string ("."); > + tree target_attr = lookup_attribute ("target_version", > + DECL_ATTRIBUTES (decl)); > + > + if (target_attr == NULL_TREE) > + { > + name += "default"; > + return get_identifier (name.c_str ()); > + } > + > + const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE > + (target_attr))); > + > + /* Replace non-alphanumeric characters with underscores as the suffix. */ > + for (const char *c = version_string; *c; c++) > + name += ISALNUM (*c) == 0 ? '_' : *c; > + > + if (DECL_ASSEMBLER_NAME_SET_P (decl)) > + SET_DECL_RTL (decl, NULL); > + > + id = get_identifier (name.c_str ()); > + } > + > + return id; > +} > + > +/* Implement TARGET_GENERATE_VERSION_DISPATCHER_BODY. */ > + > +tree > +riscv_generate_version_dispatcher_body (void *node_p) > +{ > + tree resolver_decl; > + basic_block empty_bb; > + tree default_ver_decl; > + struct cgraph_node *versn; > + struct cgraph_node *node; > + > + struct cgraph_function_version_info *node_version_info = NULL; > + struct cgraph_function_version_info *versn_info = NULL; > + > + node = (cgraph_node *)node_p; > + > + node_version_info = node->function_version (); > + gcc_assert (node->dispatcher_function > + && node_version_info != NULL); > + > + if (node_version_info->dispatcher_resolver) > + return node_version_info->dispatcher_resolver; > + > + /* The first version in the chain corresponds to the default version. */ > + default_ver_decl = node_version_info->next->this_node->decl; > + > + /* node is going to be an alias, so remove the finalized bit. 
*/ > + node->definition = false; > + > + resolver_decl = make_resolver_func (default_ver_decl, > + node->decl, &empty_bb); > + > + node_version_info->dispatcher_resolver = resolver_decl; > + > + push_cfun (DECL_STRUCT_FUNCTION (resolver_decl)); > + > + auto_vec<tree, 2> fn_ver_vec; > + > + for (versn_info = node_version_info->next; versn_info; > + versn_info = versn_info->next) > + { > + versn = versn_info->this_node; > + /* Check for virtual functions here again, as by this time it should > + have been determined if this function needs a vtable index or > + not. This happens for methods in derived classes that override > + virtual methods in base classes but are not explicitly marked as > + virtual. */ > + if (DECL_VINDEX (versn->decl)) > + sorry ("virtual function multiversioning not supported"); > + > + fn_ver_vec.safe_push (versn->decl); > + } > + > + dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb); > + cgraph_edge::rebuild_edges (); > + pop_cfun (); > + > + /* Fix up symbol names. First we need to obtain the base name, which may > + have already been mangled. */ > + tree base_name = get_suffixed_assembler_name (default_ver_decl, ""); > + > + /* We need to redo the version mangling on the non-default versions for the > + target_clones case. Redoing the mangling for the target_version case is > + redundant but does no harm. We need to skip the default version, because > + expand_clones will append ".default" later; fortunately that suffix is the > + one we want anyway. */ > + for (versn_info = node_version_info->next->next; versn_info; > + versn_info = versn_info->next) > + { > + tree version_decl = versn_info->this_node->decl; > + tree name = riscv_mangle_decl_assembler_name (version_decl, > + base_name); > + symtab->change_decl_assembler_name (version_decl, name); > + } > + > + /* We also need to use the base name for the ifunc declaration. 
*/ > + symtab->change_decl_assembler_name (node->decl, base_name); > + > + return resolver_decl; > +} > + > +/* Make a dispatcher declaration for the multi-versioned function DECL. > + Calls to DECL function will be replaced with calls to the dispatcher > + by the front-end. Returns the decl of the dispatcher function. */ > + > +tree > +riscv_get_function_versions_dispatcher (void *decl) > +{ > + tree fn = (tree) decl; > + struct cgraph_node *node = NULL; > + struct cgraph_node *default_node = NULL; > + struct cgraph_function_version_info *node_v = NULL; > + struct cgraph_function_version_info *first_v = NULL; > + > + tree dispatch_decl = NULL; > + > + struct cgraph_function_version_info *default_version_info = NULL; > + > + gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn)); > + > + node = cgraph_node::get (fn); > + gcc_assert (node != NULL); > + > + node_v = node->function_version (); > + gcc_assert (node_v != NULL); > + > + if (node_v->dispatcher_resolver != NULL) > + return node_v->dispatcher_resolver; > + > + /* Find the default version and make it the first node. */ > + first_v = node_v; > + /* Go to the beginning of the chain. */ > + while (first_v->prev != NULL) > + first_v = first_v->prev; > + default_version_info = first_v; > + > + while (default_version_info != NULL) > + { > + struct riscv_feature_bits res; > + int priority; /* Unused. */ > + parse_features_for_version (default_version_info->this_node->decl, > + res, priority); > + if (res.length == 0) > + break; > + default_version_info = default_version_info->next; > + } > + > + /* If there is no default node, just return NULL. */ > + if (default_version_info == NULL) > + return NULL; > + > + /* Make default info the first node. 
*/ > + if (first_v != default_version_info) > + { > + default_version_info->prev->next = default_version_info->next; > + if (default_version_info->next) > + default_version_info->next->prev = default_version_info->prev; > + first_v->prev = default_version_info; > + default_version_info->next = first_v; > + default_version_info->prev = NULL; > + } > + > + default_node = default_version_info->this_node; > + > + if (targetm.has_ifunc_p ()) > + { > + struct cgraph_function_version_info *it_v = NULL; > + struct cgraph_node *dispatcher_node = NULL; > + struct cgraph_function_version_info *dispatcher_version_info = NULL; > + > + /* Right now, the dispatching is done via ifunc. */ > + dispatch_decl = make_dispatcher_decl (default_node->decl); > + TREE_NOTHROW (dispatch_decl) = TREE_NOTHROW (fn); > + > + dispatcher_node = cgraph_node::get_create (dispatch_decl); > + gcc_assert (dispatcher_node != NULL); > + dispatcher_node->dispatcher_function = 1; > + dispatcher_version_info > + = dispatcher_node->insert_new_function_version (); > + dispatcher_version_info->next = default_version_info; > + dispatcher_node->definition = 1; > + > + /* Set the dispatcher for all the versions. */ > + it_v = default_version_info; > + while (it_v != NULL) > + { > + it_v->dispatcher_resolver = dispatch_decl; > + it_v = it_v->next; > + } > + } > + else > + { > + error_at (DECL_SOURCE_LOCATION (default_node->decl), > + "multiversioning needs %<ifunc%> which is not supported " > + "on this target"); > + } > + > + return dispatch_decl; > +} > + > /* On riscv we have an ABI defined safe buffer. This constant is used to > determining the probe offset for alloca. 
*/ > > @@ -12600,6 +13336,10 @@ riscv_stack_clash_protection_alloca_probe_range (void) > #undef TARGET_OPTION_VALID_ATTRIBUTE_P > #define TARGET_OPTION_VALID_ATTRIBUTE_P riscv_option_valid_attribute_p > > +#undef TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P > +#define TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P \ > + riscv_option_valid_version_attribute_p > + > #undef TARGET_LEGITIMIZE_ADDRESS > #define TARGET_LEGITIMIZE_ADDRESS riscv_legitimize_address > > @@ -12948,6 +13688,23 @@ riscv_stack_clash_protection_alloca_probe_range (void) > #undef TARGET_C_MODE_FOR_FLOATING_TYPE > #define TARGET_C_MODE_FOR_FLOATING_TYPE riscv_c_mode_for_floating_type > > +#undef TARGET_OPTION_FUNCTION_VERSIONS > +#define TARGET_OPTION_FUNCTION_VERSIONS riscv_common_function_versions > + > +#undef TARGET_COMPARE_VERSION_PRIORITY > +#define TARGET_COMPARE_VERSION_PRIORITY riscv_compare_version_priority > + > +#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY > +#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \ > + riscv_generate_version_dispatcher_body > + > +#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER > +#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \ > + riscv_get_function_versions_dispatcher > + > +#undef TARGET_MANGLE_DECL_ASSEMBLER_NAME > +#define TARGET_MANGLE_DECL_ASSEMBLER_NAME riscv_mangle_decl_assembler_name > + > struct gcc_target targetm = TARGET_INITIALIZER; > > #include "gt-riscv.h" > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h > index ca1b8329cdc..8a8b08b6b51 100644 > --- a/gcc/config/riscv/riscv.h > +++ b/gcc/config/riscv/riscv.h > @@ -1298,4 +1298,11 @@ extern void riscv_remove_unneeded_save_restore_calls (void); > STACK_BOUNDARY / BITS_PER_UNIT) \ > : (crtl->outgoing_args_size + STACK_POINTER_OFFSET)) > > +/* According to the RISC-V C API, the arch string may contain ','. To avoid > + the conflict with the default separator, we choose '#' as the separator for > + the target attribute.
*/ > +#define TARGET_CLONES_ATTR_SEPARATOR '#' > + > +#define TARGET_HAS_FMV_TARGET_ATTRIBUTE 0 > + > #endif /* ! GCC_RISCV_H */ > diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt > index 6360ed3984d..61def798ca0 100644 > --- a/gcc/config/riscv/riscv.opt > +++ b/gcc/config/riscv/riscv.opt > @@ -523,6 +523,9 @@ Mask(XSFVCP) Var(riscv_sifive_subext) > > Mask(XSFCEASE) Var(riscv_sifive_subext) > > +TargetVariable > +int riscv_fmv_priority = 0 > + > Enum > Name(isa_spec_class) Type(enum riscv_isa_spec_class) > Supported ISA specs (for use with the -misa-spec= option): > diff --git a/gcc/defaults.h b/gcc/defaults.h > index ac2d25852ab..918e3ec2f24 100644 > --- a/gcc/defaults.h > +++ b/gcc/defaults.h > @@ -874,6 +874,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see > #define TARGET_HAS_FMV_TARGET_ATTRIBUTE 1 > #endif > > +/* Select an attribute separator for function multiversioning. */ > +#ifndef TARGET_CLONES_ATTR_SEPARATOR > +#define TARGET_CLONES_ATTR_SEPARATOR ',' > +#endif > > /* Select a format to encode pointers in exception handling data. We > prefer those that result in fewer dynamic relocations. Assume no > diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc > index 1fdd279da04..c1e358dfc1e 100644 > --- a/gcc/multiple_target.cc > +++ b/gcc/multiple_target.cc > @@ -180,7 +180,7 @@ create_dispatcher_calls (struct cgraph_node *node) > } > } > > -/* Create string with attributes separated by comma. > +/* Create string with attributes separated by TARGET_CLONES_ATTR_SEPARATOR. > Return number of attributes.
*/ > > static int > @@ -194,17 +194,21 @@ get_attr_str (tree arglist, char *attr_str) > { > const char *str = TREE_STRING_POINTER (TREE_VALUE (arg)); > size_t len = strlen (str); > - for (const char *p = strchr (str, ','); p; p = strchr (p + 1, ',')) > + for (const char *p = strchr (str, TARGET_CLONES_ATTR_SEPARATOR); > + p; > + p = strchr (p + 1, TARGET_CLONES_ATTR_SEPARATOR)) > argnum++; > memcpy (attr_str + str_len_sum, str, len); > - attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0'; > + attr_str[str_len_sum + len] > + = TREE_CHAIN (arg) ? TARGET_CLONES_ATTR_SEPARATOR : '\0'; > str_len_sum += len + 1; > argnum++; > } > return argnum; > } > > -/* Return number of attributes separated by comma and put them into ARGS. > +/* Return number of attributes separated by TARGET_CLONES_ATTR_SEPARATOR > + and put them into ARGS. > If there is no DEFAULT attribute return -1. > If there is an empty string in attribute return -2. > If there are multiple DEFAULT attributes return -3. > @@ -215,9 +219,10 @@ separate_attrs (char *attr_str, char **attrs, int attrnum) > { > int i = 0; > int default_count = 0; > + static const char separator_str[] = { TARGET_CLONES_ATTR_SEPARATOR, 0 }; > > - for (char *attr = strtok (attr_str, ","); > - attr != NULL; attr = strtok (NULL, ",")) > + for (char *attr = strtok (attr_str, separator_str); > + attr != NULL; attr = strtok (NULL, separator_str)) > { > if (strcmp (attr, "default") == 0) > { > @@ -305,7 +310,7 @@ static bool > expand_target_clones (struct cgraph_node *node, bool definition) > { > int i; > - /* Parsing target attributes separated by comma. */ > + /* Parsing target attributes separated by TARGET_CLONES_ATTR_SEPARATOR. */ > tree attr_target = lookup_attribute ("target_clones", > DECL_ATTRIBUTES (node->decl)); > /* No targets specified. */ > -- > 2.45.2 >
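For readers following the dispatcher in the quoted patch, here is a minimal sketch of the run-time check that the generated resolver appears to perform, written as plain C++ instead of GIMPLE. It assumes that the versions have already been sorted in descending dispatch priority (as dispatch_function_versions does with qsort) and that the default version carries all-zero feature bits. The names FeatureBits, Version, version_is_usable and select_version are hypothetical and do not appear in the patch; only the mask logic (mask = ~features, then mask & required == 0) is taken from the comments in the code above.

```cpp
#include <cassert>
#include <vector>

// Mirrors RISCV_FEATURE_BITS_LENGTH from the patch.
constexpr int kFeatureBitsLength = 2;

// Hypothetical simplified stand-in for struct riscv_feature_bits.
struct FeatureBits {
  unsigned long long features[kFeatureBitsLength];
};

// Hypothetical stand-in for one entry of the sorted function_versions array.
struct Version {
  const char *name;
  FeatureBits required;
};

// Returns true when every bit set in `required` is also set in `available`.
// This follows the shape of add_condition_to_bb: the resolver stores
// mask[i] = ~available[i] once, then each version checks that
// (mask[i] & required[i]) == 0 for every group i.
bool version_is_usable(const FeatureBits &available,
                       const FeatureBits &required) {
  bool cond_status = true;
  for (int i = 0; i < kFeatureBitsLength; i++) {
    unsigned long long mask = ~available.features[i];
    cond_status &= (mask & required.features[i]) == 0;
  }
  return cond_status;
}

// Walks the versions in the order given (assumed to be descending priority)
// and returns the first usable one, or nullptr if none matches.
const Version *select_version(const FeatureBits &available,
                              const std::vector<Version> &sorted_versions) {
  // The patch asserts at least one version besides the default.
  assert(sorted_versions.size() >= 2);
  for (const Version &v : sorted_versions)
    if (version_is_usable(available, v.required))
      return &v;
  return nullptr;
}
```

Under these assumptions, a default version with no required bits always passes the check, so it acts as the fallback as long as it sorts last. For example, using the bit positions from riscv_ext_bitmask_table (v is group 0 bit 21, zba is bit 27, zbb is bit 28), hardware reporting v and zba would skip a zbb version and pick a zba one.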
> On Oct 21, 2024, at 10:41, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Could you add testcases? Also, could you splitted that into smaller
> patches to make it easier to review?
>

Done!

Link: https://patchwork.sourceware.org/project/gcc/list/?series=39772
Ack. Just letting you know I still remember this. I am attending LLVM dev and the RISC-V summit this week, and I will review soon once I get back. Do you mind letting me approve and commit a few refactor/NFC patches first?

On Mon, Oct 21, 2024 at 11:57 AM Yangyu Chen <cyy@cyyself.name> wrote:
>
>
> > On Oct 21, 2024, at 10:41, Kito Cheng <kito.cheng@gmail.com> wrote:
> >
> > Could you add testcases? Also, could you splitted that into smaller
> > patches to make it easier to review?
> >
>
> Done!
>
> Link: https://patchwork.sourceware.org/project/gcc/list/?series=39772
>
> On Oct 24, 2024, at 14:53, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> ack, let you know I still remember this, but I just attending LLVM dev
> and RISC-V summit this week, will review soon once I get back, and do
> you mind letting me approve and commit few refactor/NFC patches first?
>

Sure. I’ve also been testing these patches recently, since one of my research projects relies on this.

FYI, there is a v3 now.

Link: https://patchwork.sourceware.org/project/gcc/list/?series=39863&state=*

Thanks,
Yangyu Chen

> On Mon, Oct 21, 2024 at 11:57 AM Yangyu Chen <cyy@cyyself.name> wrote:
>>
>>
>>> On Oct 21, 2024, at 10:41, Kito Cheng <kito.cheng@gmail.com> wrote:
>>>
>>> Could you add testcases? Also, could you splitted that into smaller
>>> patches to make it easier to review?
>>>
>>
>> Done!
>>
>> Link: https://patchwork.sourceware.org/project/gcc/list/?series=39772
diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 2adebe0b6f2..2fc21424ebe 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -19,6 +19,7 @@ along with GCC; see the file COPYING3. If not see #include <sstream> #include <vector> +#include <queue> #define INCLUDE_STRING #define INCLUDE_SET @@ -1760,6 +1761,75 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] = {NULL, NULL, NULL, 0} }; +/* Types for recording extension to RISC-V C-API bitmask. */ +struct riscv_ext_bitmask_table_t { + const char *ext; + int groupid; + int bit_position; +}; + +/* Mapping table between extension to RISC-V C-API extension bitmask. + This table should sort the extension by Linux hwprobe order to get the + minimal feature bits. */ +static const riscv_ext_bitmask_table_t riscv_ext_bitmask_table[] = +{ + {"i", 0, 8}, + {"m", 0, 12}, + {"a", 0, 0}, + {"f", 0, 5}, + {"d", 0, 3}, + {"c", 0, 2}, + {"v", 0, 21}, + {"zba", 0, 27}, + {"zbb", 0, 28}, + {"zbs", 0, 33}, + {"zicboz", 0, 37}, + {"zbc", 0, 29}, + {"zbkb", 0, 30}, + {"zbkc", 0, 31}, + {"zbkx", 0, 32}, + {"zknd", 0, 41}, + {"zkne", 0, 42}, + {"zknh", 0, 43}, + {"zksed", 0, 44}, + {"zksh", 0, 45}, + {"zkt", 0, 46}, + {"zvbb", 0, 48}, + {"zvbc", 0, 49}, + {"zvkb", 0, 52}, + {"zvkg", 0, 53}, + {"zvkned", 0, 54}, + {"zvknha", 0, 55}, + {"zvknhb", 0, 56}, + {"zvksed", 0, 57}, + {"zvksh", 0, 58}, + {"zvkt", 0, 59}, + {"zfh", 0, 35}, + {"zfhmin", 0, 36}, + {"zihintntl", 0, 39}, + {"zvfh", 0, 50}, + {"zvfhmin", 0, 51}, + {"zfa", 0, 34}, + {"ztso", 0, 47}, + {"zacas", 0, 26}, + {"zicond", 0, 38}, + {"zihintpause", 0, 40}, + {"zve32x", 0, 60}, + {"zve32f", 0, 61}, + {"zve64x", 0, 62}, + {"zve64f", 0, 63}, + {"zve64d", 1, 0}, + {"zimop", 1, 1}, + {"zca", 1, 2}, + {"zcb", 1, 3}, + {"zcd", 1, 4}, + {"zcf", 1, 5}, + {"zcmop", 1, 6}, + {"zawrs", 1, 7}, + + {NULL, -1, -1} +}; + /* Apply SUBSET_LIST to OPTS if OPTS is not null. 
*/ void @@ -1826,6 +1896,81 @@ riscv_x_target_flags_isa_mask (void) return mask; } +/* Get the minimal feature bits in Linux hwprobe of the given ISA string. + + Used for generating Function Multi-Versioning (FMV) dispatcher for RISC-V. + + The minimal feature bits refer to using the earliest extension that appeared + in the Linux hwprobe to support the specified ISA string. This ensures that + older kernels, which may lack certain implied extensions, can still run the + FMV dispatcher correctly. */ + +bool +riscv_minimal_hwprobe_feature_bits (const char *isa, + struct riscv_feature_bits *res, + location_t loc) +{ + riscv_subset_list *subset_list; + subset_list = riscv_subset_list::parse (isa, loc); + if (!subset_list) + return false; + + /* Initialize the result feature bits to zero. */ + res->length = RISCV_FEATURE_BITS_LENGTH; + for (int i = 0; i < RISCV_FEATURE_BITS_LENGTH; ++i) + res->features[i] = 0; + + /* Use a std::set to record all visited implied extensions. */ + std::set <std::string> implied_exts; + + /* Iterate through the extension bitmask table in Linux hwprobe order to get + the minimal covered feature bits. Avoiding some sub-extensions which will + be implied by the super-extensions like V implied Zve32x. */ + const riscv_ext_bitmask_table_t *ext_bitmask_tab; + for (ext_bitmask_tab = &riscv_ext_bitmask_table[0]; + ext_bitmask_tab->ext; + ++ext_bitmask_tab) + { + /* Skip the extension if it is not in the subset list or already implied + by previous extension. */ + if (subset_list->lookup (ext_bitmask_tab->ext) == NULL + || implied_exts.count (ext_bitmask_tab->ext)) + continue; + + res->features[ext_bitmask_tab->groupid] + |= 1ULL << ext_bitmask_tab->bit_position; + + /* Find the sub-extension using BFS and set the corresponding bit. 
*/ + std::queue <const char *> search_q; + search_q.push (ext_bitmask_tab->ext); + + while (!search_q.empty ()) + { + const char * search_ext = search_q.front (); + search_q.pop (); + + /* Iterate through the implied extension table. */ + const riscv_implied_info_t *implied_info; + for (implied_info = &riscv_implied_info[0]; + implied_info->ext; + ++implied_info) + { + /* When the search extension matches the implied extension and + the implied extension has not been visited, mark the implied + extension in the implied_exts set and push it into the + queue. */ + if (implied_info->match (subset_list, search_ext) + && implied_exts.count (implied_info->implied_ext) == 0) + { + implied_exts.insert (implied_info->implied_ext); + search_q.push (implied_info->implied_ext); + } + } + } + } + return true; +} + /* Parse a RISC-V ISA string into an option mask. Must clear or set all arch dependent mask bits, in case more than one -march string is passed. */ diff --git a/gcc/config/riscv/feature_bits.h b/gcc/config/riscv/feature_bits.h new file mode 100644 index 00000000000..19b7630e339 --- /dev/null +++ b/gcc/config/riscv/feature_bits.h @@ -0,0 +1,44 @@ +/* Definition of RISC-V feature bits corresponding to + libgcc/config/riscv/feature_bits.c + Copyright (C) 2024 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. 
*/ + +#ifndef GCC_RISCV_FEATURE_BITS_H +#define GCC_RISCV_FEATURE_BITS_H + +#define RISCV_FEATURE_BITS_LENGTH 2 + +struct riscv_feature_bits { + unsigned length; + unsigned long long features[RISCV_FEATURE_BITS_LENGTH]; +}; + +#define RISCV_VENDOR_FEATURE_BITS_LENGTH 1 + +struct riscv_vendor_feature_bits { + unsigned length; + unsigned long long features[RISCV_VENDOR_FEATURE_BITS_LENGTH]; +}; + +struct riscv_cpu_model { + unsigned mvendorid; + unsigned long long marchid; + unsigned long long mimpid; +}; + +#endif /* GCC_RISCV_FEATURE_BITS_H */ diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index d690162bb0c..a35316f1228 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -799,6 +799,10 @@ extern bool riscv_use_divmod_expander (void); void riscv_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int); extern bool riscv_option_valid_attribute_p (tree, tree, tree, int); +extern bool +riscv_option_valid_version_attribute_p (tree, tree, tree, int); +extern bool +riscv_process_target_attr (const char *, location_t); extern void riscv_override_options_internal (struct gcc_options *); extern void riscv_option_override (void); diff --git a/gcc/config/riscv/riscv-subset.h b/gcc/config/riscv/riscv-subset.h index 1914a5317d7..75008be6613 100644 --- a/gcc/config/riscv/riscv-subset.h +++ b/gcc/config/riscv/riscv-subset.h @@ -22,6 +22,8 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_RISCV_SUBSET_H #define GCC_RISCV_SUBSET_H +#include "feature_bits.h" + #define RISCV_DONT_CARE_VERSION -1 /* Subset info. 
*/ @@ -120,6 +122,9 @@ public: extern const riscv_subset_list *riscv_cmdline_subset_list (void); extern void riscv_set_arch_by_subset_list (riscv_subset_list *, struct gcc_options *); +extern bool riscv_minimal_hwprobe_feature_bits (const char *, + struct riscv_feature_bits *, + location_t); extern bool riscv_ext_is_subset (struct cl_target_option *, struct cl_target_option *); extern int riscv_x_target_flags_isa_mask (void); diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc index bf14ade5ce0..6a28281b0c9 100644 --- a/gcc/config/riscv/riscv-target-attr.cc +++ b/gcc/config/riscv/riscv-target-attr.cc @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3. If not see #include "diagnostic.h" #include "opts.h" #include "riscv-subset.h" +#include "stringpool.h" +#include "attribs.h" namespace { class riscv_target_attr_parser @@ -39,6 +41,7 @@ public: : m_found_arch_p (false) , m_found_tune_p (false) , m_found_cpu_p (false) + , m_found_priority_p (false) , m_subset_list (nullptr) , m_loc (loc) , m_cpu_info (nullptr) @@ -49,6 +52,7 @@ public: bool handle_arch (const char *); bool handle_cpu (const char *); bool handle_tune (const char *); + bool handle_priority (const char *); void update_settings (struct gcc_options *opts) const; private: @@ -58,10 +62,12 @@ private: bool m_found_arch_p; bool m_found_tune_p; bool m_found_cpu_p; + bool m_found_priority_p; riscv_subset_list *m_subset_list; location_t m_loc; const riscv_cpu_info *m_cpu_info; const char *m_tune; + int m_priority; }; } @@ -80,7 +86,8 @@ struct riscv_attribute_info static const struct riscv_attribute_info riscv_attributes[] = {{"arch", &riscv_target_attr_parser::handle_arch}, {"cpu", &riscv_target_attr_parser::handle_cpu}, - {"tune", &riscv_target_attr_parser::handle_tune}}; + {"tune", &riscv_target_attr_parser::handle_tune}, + {"priority", &riscv_target_attr_parser::handle_priority}}; bool riscv_target_attr_parser::parse_arch (const char *str) @@ -210,6 +217,22 @@ 
riscv_target_attr_parser::handle_tune (const char *str) return true; } +bool +riscv_target_attr_parser::handle_priority (const char *str) +{ + if (m_found_priority_p) + error_at (m_loc, "%<target()%> attribute: priority appears more than once"); + m_found_priority_p = true; + + if (sscanf (str, "%d", &m_priority) != 1) + { + error_at (m_loc, "%<target()%> attribute: invalid priority %qs", str); + return false; + } + + return true; +} + void riscv_target_attr_parser::update_settings (struct gcc_options *opts) const { @@ -217,10 +240,6 @@ riscv_target_attr_parser::update_settings (struct gcc_options *opts) const { std::string local_arch = m_subset_list->to_string (true); const char* local_arch_str = local_arch.c_str (); - struct cl_target_option *default_opts - = TREE_TARGET_OPTION (target_option_default_node); - if (opts->x_riscv_arch_string != default_opts->x_riscv_arch_string) - free (CONST_CAST (void *, (const void *) opts->x_riscv_arch_string)); opts->x_riscv_arch_string = xstrdup (local_arch_str); riscv_set_arch_by_subset_list (m_subset_list, opts); @@ -236,13 +255,16 @@ riscv_target_attr_parser::update_settings (struct gcc_options *opts) const if (m_cpu_info) opts->x_riscv_tune_string = m_cpu_info->tune; } + + if (m_found_priority_p) + opts->x_riscv_fmv_priority = m_priority; } /* Parse ARG_STR which contains the definition of one target attribute. Show appropriate errors if any or return true if the attribute is valid. */ static bool -riscv_process_one_target_attr (char *arg_str, +riscv_process_one_target_attr (const char *arg_str, location_t loc, riscv_target_attr_parser &attr_parser) { @@ -271,6 +293,12 @@ riscv_process_one_target_attr (char *arg_str, arg[0] = '\0'; ++arg; + + /* Skip splitter ';' if it exists. 
*/ + char *splitter = strchr (arg, ';'); + if (splitter) + splitter[0] = '\0'; + for (const auto &attr : riscv_attributes) { /* If the names don't match up, or the user has given an argument @@ -304,35 +332,13 @@ num_occurrences_in_str (char c, char *str) return res; } -/* Parse the tree in ARGS that contains the target attribute information +/* Parse the string ARGS that contains the target attribute information and update the global target options space. */ -static bool -riscv_process_target_attr (tree args, location_t loc) +bool +riscv_process_target_attr (const char *args, location_t loc) { - if (TREE_CODE (args) == TREE_LIST) - { - do - { - tree head = TREE_VALUE (args); - if (head) - { - if (!riscv_process_target_attr (head, loc)) - return false; - } - args = TREE_CHAIN (args); - } while (args); - - return true; - } - - if (TREE_CODE (args) != STRING_CST) - { - error_at (loc, "attribute %<target%> argument not a string"); - return false; - } - - size_t len = strlen (TREE_STRING_POINTER (args)); + size_t len = strlen (args); /* No need to emit warning or error on empty string here, generic code already handle this case. */ @@ -341,9 +347,14 @@ riscv_process_target_attr (tree args, location_t loc) return false; } + if (strcmp (args, "default") == 0) + { + return true; + } + std::unique_ptr<char[]> buf (new char[len+1]); char *str_to_check = buf.get (); - strcpy (str_to_check, TREE_STRING_POINTER (args)); + strcpy (str_to_check, args); /* Used to catch empty spaces between semi-colons i.e. attribute ((target ("attr1;;attr2"))). */ @@ -366,7 +377,7 @@ riscv_process_target_attr (tree args, location_t loc) if (num_attrs != num_semicolons + 1) { error_at (loc, "malformed %<target(\"%s\")%> attribute", - TREE_STRING_POINTER (args)); + args); return false; } @@ -376,6 +387,37 @@ riscv_process_target_attr (tree args, location_t loc) return true; } +/* Parse the tree in ARGS that contains the target attribute information + and update the global target options space. 
*/ + +static bool +riscv_process_target_attr (tree args, location_t loc) +{ + if (TREE_CODE (args) == TREE_LIST) + { + do + { + tree head = TREE_VALUE (args); + if (head) + { + if (!riscv_process_target_attr (head, loc)) + return false; + } + args = TREE_CHAIN (args); + } while (args); + + return true; + } + + if (TREE_CODE (args) != STRING_CST) + { + error_at (loc, "attribute %<target%> argument not a string"); + return false; + } + + return riscv_process_target_attr (TREE_STRING_POINTER (args), loc); +} + /* Implement TARGET_OPTION_VALID_ATTRIBUTE_P. This is used to process attribute ((target ("..."))). Note, that riscv_set_current_function() has not been called before, @@ -412,6 +454,20 @@ riscv_option_valid_attribute_p (tree fndecl, tree, tree args, int) /* Now we can parse the attributes and set &global_options accordingly. */ ret = riscv_process_target_attr (args, loc); + if (ret) + { + tree version_attr = lookup_attribute ("target_version", + DECL_ATTRIBUTES (fndecl)); + if (version_attr != NULL_TREE) + { + // Reapply any target_version attribute after target attribute. + // This should be equivalent to applying the target_version once + // after processing all target attributes. + tree version_args = TREE_VALUE (version_attr); + riscv_process_target_attr (version_args, + DECL_SOURCE_LOCATION (fndecl)); + } + } if (ret) { riscv_override_options_internal (&global_options); @@ -424,3 +480,89 @@ riscv_option_valid_attribute_p (tree fndecl, tree, tree args, int) cl_target_option_restore (&global_options, &global_options_set, &cur_target); return ret; } + +/* Parse the tree in ARGS that contains the target_version attribute + information and update the global target options space. 
*/ + +static bool +riscv_process_target_version_attr (tree args, location_t loc) +{ + if (TREE_CODE (args) == TREE_LIST) + { + if (TREE_CHAIN (args)) + { + error ("attribute %<target_version%> has multiple values"); + return false; + } + args = TREE_VALUE (args); + } + + if (!args || TREE_CODE (args) != STRING_CST) + { + error ("attribute %<target_version%> argument not a string"); + return false; + } + + const char *str = TREE_STRING_POINTER (args); + if (strcmp (str, "default") == 0) + return true; + + riscv_target_attr_parser attr_parser (loc); + if (!riscv_process_one_target_attr (str, loc, attr_parser)) + return false; + + attr_parser.update_settings (&global_options); + return true; +} + + +/* Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P. This is used to + process attribute ((target_version ("..."))). */ + +bool +riscv_option_valid_version_attribute_p (tree fndecl, tree, tree args, int) +{ + struct cl_target_option cur_target; + bool ret; + tree new_target; + tree existing_target = DECL_FUNCTION_SPECIFIC_TARGET (fndecl); + location_t loc = DECL_SOURCE_LOCATION (fndecl); + + /* Save the current target options to restore at the end. */ + cl_target_option_save (&cur_target, &global_options, &global_options_set); + + /* If fndecl already has some target attributes applied to it, unpack + them so that we add this attribute on top of them, rather than + overwriting them. */ + if (existing_target) + { + struct cl_target_option *existing_options + = TREE_TARGET_OPTION (existing_target); + + if (existing_options) + cl_target_option_restore (&global_options, &global_options_set, + existing_options); + } + else + cl_target_option_restore (&global_options, &global_options_set, + TREE_TARGET_OPTION (target_option_current_node)); + + ret = riscv_process_target_attr (args, loc); + + /* Set up any additional state. 
*/ + if (ret) + { + riscv_override_options_internal (&global_options); + new_target = build_target_option_node (&global_options, + &global_options_set); + } + else + new_target = NULL; + + if (fndecl && ret) + DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target; + + cl_target_option_restore (&global_options, &global_options_set, &cur_target); + + return ret; +} diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 3ac40234345..f091d022e43 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -77,6 +77,9 @@ along with GCC; see the file COPYING3. If not see #include "tree-dfa.h" #include "target-globals.h" #include "riscv-v.h" +#include "cgraph.h" +#include "langhooks.h" +#include "gimplify.h" /* This file should be included last. */ #include "target-def.h" @@ -7666,6 +7669,10 @@ riscv_compute_frame_info (void) static bool riscv_can_inline_p (tree caller, tree callee) { + /* Do not inline when callee is versioned but caller is not. */ + if (DECL_FUNCTION_VERSIONED (callee) && ! DECL_FUNCTION_VERSIONED (caller)) + return false; + tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); @@ -12574,6 +12581,735 @@ riscv_c_mode_for_floating_type (enum tree_index ti) return default_mode_for_floating_type (ti); } +/* This parses the attribute arguments to target_version in DECL and modifies + the feature mask and priority required to select those targets. 
*/ +static void +parse_features_for_version (tree decl, + struct riscv_feature_bits &res, + int &priority) +{ + tree version_attr = lookup_attribute ("target_version", + DECL_ATTRIBUTES (decl)); + if (version_attr == NULL_TREE) + { + res.length = 0; + priority = 0; + return; + } + + const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE + (version_attr))); + gcc_assert (version_string != NULL); + if (strcmp (version_string, "default") == 0) + { + res.length = 0; + priority = 0; + return; + } + struct cl_target_option cur_target; + cl_target_option_save (&cur_target, &global_options, + &global_options_set); + /* Always set to default option before parsing "arch=+..." */ + struct cl_target_option *default_opts + = TREE_TARGET_OPTION (target_option_default_node); + cl_target_option_restore (&global_options, &global_options_set, + default_opts); + + riscv_process_target_attr (version_string, + DECL_SOURCE_LOCATION (decl)); + + priority = global_options.x_riscv_fmv_priority; + const char *arch_string = global_options.x_riscv_arch_string; + bool parse_res + = riscv_minimal_hwprobe_feature_bits (arch_string, &res, + DECL_SOURCE_LOCATION (decl)); + gcc_assert (parse_res); + + if (arch_string != default_opts->x_riscv_arch_string) + free (CONST_CAST (void *, (const void *) arch_string)); + + cl_target_option_restore (&global_options, &global_options_set, + &cur_target); +} + +/* Compare priorities of two feature masks. Return: + 1: mask1 is higher priority + -1: mask2 is higher priority + 0: masks are equal. + Since riscv_feature_bits has total 128 bits to be used as mask, when counting + the total 1s in the mask, the 1s in group1 needs to multiply a weight. */ +static int +compare_fmv_features (const struct riscv_feature_bits &mask1, + const struct riscv_feature_bits &mask2, + int prio1, int prio2) +{ + unsigned length1 = mask1.length, length2 = mask2.length; + /* 1. 
Compare length; length == 0 means the default version, which should have + the lowest priority. */ + if (length1 != length2) + return length1 > length2 ? 1 : -1; + /* 2. Compare the priority. */ + if (prio1 != prio2) + return prio1 > prio2 ? 1 : -1; + /* 3. Compare the total number of 1s in the mask. */ + unsigned pop1 = 0, pop2 = 0; + for (int i = 0; i < length1; i++) + { + pop1 += __builtin_popcountll (mask1.features[i]); + pop2 += __builtin_popcountll (mask2.features[i]); + } + if (pop1 != pop2) + return pop1 > pop2 ? 1 : -1; + /* 4. Compare the masks bit by bit, in order. */ + for (int i = 0; i < length1; i++) + { + unsigned long long xor_mask = mask1.features[i] ^ mask2.features[i]; + if (xor_mask == 0) + continue; + return TEST_BIT (mask1.features[i], __builtin_ctzll (xor_mask)) ? 1 : -1; + } + /* 5. If all bits are equal, return 0. */ + return 0; +} + +/* Compare priorities of two version decls. Return: + 1: decl1 is higher priority + -1: decl2 is higher priority + 0: priorities are equal. */ +int +riscv_compare_version_priority (tree decl1, tree decl2) +{ + struct riscv_feature_bits mask1, mask2; + int prio1, prio2; + + parse_features_for_version (decl1, mask1, prio1); + parse_features_for_version (decl2, mask2, prio2); + + return compare_fmv_features (mask1, mask2, prio1, prio2); +} + + +/* This function returns true if FN1 and FN2 are versions of the same function, + that is, the target_version attributes of the function decls are different. + This assumes that FN1 and FN2 have the same signature. */ + +bool +riscv_common_function_versions (tree fn1, tree fn2) +{ + if (TREE_CODE (fn1) != FUNCTION_DECL + || TREE_CODE (fn2) != FUNCTION_DECL) + return false; + + return riscv_compare_version_priority (fn1, fn2) != 0; +} + +/* This adds a condition to the basic_block NEW_BB in function FUNCTION_DECL + to return a pointer to VERSION_DECL if all feature bits specified in + FEATURES are not set in MASK_VAR. 
This function will be called during + version dispatch to decide which function version to execute. It returns + the basic block at the end, to which more conditions can be added. */ +static basic_block +add_condition_to_bb (tree function_decl, tree version_decl, + const struct riscv_feature_bits *features, + tree mask_var, basic_block new_bb) +{ + gimple *return_stmt; + tree convert_expr, result_var; + gimple *convert_stmt; + gimple *if_else_stmt; + + basic_block bb1, bb2, bb3; + edge e12, e23; + + gimple_seq gseq; + + push_cfun (DECL_STRUCT_FUNCTION (function_decl)); + + gcc_assert (new_bb != NULL); + gseq = bb_seq (new_bb); + + convert_expr = build1 (CONVERT_EXPR, ptr_type_node, + build_fold_addr_expr (version_decl)); + result_var = create_tmp_var (ptr_type_node); + convert_stmt = gimple_build_assign (result_var, convert_expr); + return_stmt = gimple_build_return (result_var); + + if (features->length == 0) + { + /* Default version. */ + gimple_seq_add_stmt (&gseq, convert_stmt); + gimple_seq_add_stmt (&gseq, return_stmt); + set_bb_seq (new_bb, gseq); + gimple_set_bb (convert_stmt, new_bb); + gimple_set_bb (return_stmt, new_bb); + pop_cfun (); + return new_bb; + } + + tree zero_llu = build_int_cst (long_long_unsigned_type_node, 0); + tree cond_status = create_tmp_var (boolean_type_node); + tree mask_array_ele_var = create_tmp_var (long_long_unsigned_type_node); + tree and_expr_var = create_tmp_var (long_long_unsigned_type_node); + tree eq_expr_var = create_tmp_var (boolean_type_node); + + /* cond_status = true. 
*/ + gimple *cond_init_stmt = gimple_build_assign (cond_status, boolean_true_node); + gimple_set_block (cond_init_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (cond_init_stmt, new_bb); + gimple_seq_add_stmt (&gseq, cond_init_stmt); + + for (int i = 0; i < RISCV_FEATURE_BITS_LENGTH; i++) + { + tree index_expr = build_int_cst (unsigned_type_node, i); + /* mask_array_ele_var = mask_var[i] */ + tree mask_array_ref = build4 (ARRAY_REF, long_long_unsigned_type_node, + mask_var, index_expr, NULL_TREE, NULL_TREE); + + gimple *mask_stmt = gimple_build_assign (mask_array_ele_var, + mask_array_ref); + gimple_set_block (mask_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (mask_stmt, new_bb); + gimple_seq_add_stmt (&gseq, mask_stmt); + /* and_expr_var = mask_array_ele_var & features[i] */ + tree and_expr = build2 (BIT_AND_EXPR, + long_long_unsigned_type_node, + mask_array_ele_var, + build_int_cst (long_long_unsigned_type_node, + features->features[i])); + gimple *and_stmt = gimple_build_assign (and_expr_var, and_expr); + gimple_set_block (and_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (and_stmt, new_bb); + gimple_seq_add_stmt (&gseq, and_stmt); + /* eq_expr_var = and_expr_var == 0. */ + tree eq_expr = build2 (EQ_EXPR, boolean_type_node, + and_expr_var, zero_llu); + gimple *eq_stmt = gimple_build_assign (eq_expr_var, eq_expr); + gimple_set_block (eq_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (eq_stmt, new_bb); + gimple_seq_add_stmt (&gseq, eq_stmt); + /* cond_status = cond_status & eq_expr_var. 
*/ + tree cond_expr = build2 (BIT_AND_EXPR, boolean_type_node, + cond_status, eq_expr_var); + gimple *cond_stmt = gimple_build_assign (cond_status, cond_expr); + gimple_set_block (cond_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (cond_stmt, new_bb); + gimple_seq_add_stmt (&gseq, cond_stmt); + } + if_else_stmt = gimple_build_cond (EQ_EXPR, cond_status, boolean_true_node, + NULL_TREE, NULL_TREE); + gimple_set_block (if_else_stmt, DECL_INITIAL (function_decl)); + gimple_set_bb (if_else_stmt, new_bb); + gimple_seq_add_stmt (&gseq, if_else_stmt); + + gimple_seq_add_stmt (&gseq, convert_stmt); + gimple_seq_add_stmt (&gseq, return_stmt); + set_bb_seq (new_bb, gseq); + + bb1 = new_bb; + e12 = split_block (bb1, if_else_stmt); + bb2 = e12->dest; + e12->flags &= ~EDGE_FALLTHRU; + e12->flags |= EDGE_TRUE_VALUE; + + e23 = split_block (bb2, return_stmt); + + gimple_set_bb (convert_stmt, bb2); + gimple_set_bb (return_stmt, bb2); + + bb3 = e23->dest; + make_edge (bb1, bb3, EDGE_FALSE_VALUE); + + remove_edge (e23); + make_edge (bb2, EXIT_BLOCK_PTR_FOR_FN (cfun), 0); + + pop_cfun (); + + return bb3; +} + +/* This function generates the dispatch function for + multi-versioned functions. DISPATCH_DECL is the function which will + contain the dispatch logic. FNDECLS are the function choices for + dispatch, and is a tree chain. EMPTY_BB is the basic block pointer + in DISPATCH_DECL in which the dispatch code is generated. */ + +static int +dispatch_function_versions (tree dispatch_decl, + void *fndecls_p, + basic_block *empty_bb) +{ + gimple *ifunc_cpu_init_stmt; + gimple_seq gseq; + vec<tree> *fndecls; + + gcc_assert (dispatch_decl != NULL + && fndecls_p != NULL + && empty_bb != NULL); + + push_cfun (DECL_STRUCT_FUNCTION (dispatch_decl)); + + gseq = bb_seq (*empty_bb); + /* Function version dispatch is via IFUNC. IFUNC resolvers fire before + constructors, so explicitly call __init_riscv_feature_bits here. 
*/ + tree init_fn_type = build_function_type_list (void_type_node, + long_unsigned_type_node, + ptr_type_node, + NULL); + tree init_fn_id = get_identifier ("__init_riscv_feature_bits"); + tree init_fn_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, + init_fn_id, init_fn_type); + ifunc_cpu_init_stmt = gimple_build_call (init_fn_decl, 0); + gimple_seq_add_stmt (&gseq, ifunc_cpu_init_stmt); + gimple_set_bb (ifunc_cpu_init_stmt, *empty_bb); + + /* Build the struct type for __riscv_feature_bits. */ + tree global_type = lang_hooks.types.make_type (RECORD_TYPE); + tree features_type = build_array_type_nelts (long_long_unsigned_type_node, + RISCV_FEATURE_BITS_LENGTH); + tree field1 = build_decl (UNKNOWN_LOCATION, FIELD_DECL, + get_identifier ("length"), + unsigned_type_node); + tree field2 = build_decl (UNKNOWN_LOCATION, FIELD_DECL, + get_identifier ("features"), + features_type); + DECL_FIELD_CONTEXT (field1) = global_type; + DECL_FIELD_CONTEXT (field2) = global_type; + TYPE_FIELDS (global_type) = field1; + DECL_CHAIN (field1) = field2; + layout_type (global_type); + + tree global_var = build_decl (UNKNOWN_LOCATION, VAR_DECL, + get_identifier ("__riscv_feature_bits"), + global_type); + DECL_EXTERNAL (global_var) = 1; + tree mask_var = create_tmp_var (features_type); + tree feature_ele_var = create_tmp_var (long_long_unsigned_type_node); + tree noted_var = create_tmp_var (long_long_unsigned_type_node); + + + for (int i = 0; i < RISCV_FEATURE_BITS_LENGTH; i++) + { + tree index_expr = build_int_cst (unsigned_type_node, i); + /* feature_ele_var = __riscv_feature_bits.features[i] */ + tree component_expr = build3 (COMPONENT_REF, features_type, + global_var, field2, NULL_TREE); + tree feature_array_ref = build4 (ARRAY_REF, long_long_unsigned_type_node, + component_expr, index_expr, + NULL_TREE, NULL_TREE); + gimple *feature_stmt = gimple_build_assign (feature_ele_var, + feature_array_ref); + gimple_set_block (feature_stmt, DECL_INITIAL (dispatch_decl)); + gimple_set_bb 
(feature_stmt, *empty_bb); + gimple_seq_add_stmt (&gseq, feature_stmt); + /* noted_var = ~feature_ele_var. */ + tree not_expr = build1 (BIT_NOT_EXPR, long_long_unsigned_type_node, + feature_ele_var); + gimple *not_stmt = gimple_build_assign (noted_var, not_expr); + gimple_set_block (not_stmt, DECL_INITIAL (dispatch_decl)); + gimple_set_bb (not_stmt, *empty_bb); + gimple_seq_add_stmt (&gseq, not_stmt); + /* mask_var[i] = ~feature_ele_var. */ + tree mask_array_ref = build4 (ARRAY_REF, long_long_unsigned_type_node, + mask_var, index_expr, NULL_TREE, NULL_TREE); + gimple *mask_assign_stmt = gimple_build_assign (mask_array_ref, + noted_var); + gimple_set_block (mask_assign_stmt, DECL_INITIAL (dispatch_decl)); + gimple_set_bb (mask_assign_stmt, *empty_bb); + gimple_seq_add_stmt (&gseq, mask_assign_stmt); + } + + set_bb_seq (*empty_bb, gseq); + + pop_cfun (); + + /* fndecls_p is actually a vector. */ + fndecls = static_cast<vec<tree> *> (fndecls_p); + + /* At least one more version other than the default. */ + unsigned int num_versions = fndecls->length (); + gcc_assert (num_versions >= 2); + + struct function_version_info + { + tree version_decl; + int prio; + struct riscv_feature_bits features; + } *function_versions; + + function_versions = (struct function_version_info *) + XNEWVEC (struct function_version_info, (num_versions)); + + unsigned int actual_versions = 0; + + for (tree version_decl : *fndecls) + { + function_versions[actual_versions].version_decl = version_decl; + // Get attribute string, parse it and find the right features. 
+ parse_features_for_version (version_decl, + function_versions[actual_versions].features, + function_versions[actual_versions].prio); + actual_versions++; + } + + + auto compare_feature_version_info = [](const void *p1, const void *p2) + { + const function_version_info v1 = *(const function_version_info *)p1; + const function_version_info v2 = *(const function_version_info *)p2; + return - compare_fmv_features (v1.features, v2.features, + v1.prio, v2.prio); + }; + + /* Sort the versions according to descending order of dispatch priority. */ + qsort (function_versions, actual_versions, + sizeof (struct function_version_info), compare_feature_version_info); + + for (unsigned int i = 0; i < actual_versions; ++i) + { + *empty_bb = add_condition_to_bb (dispatch_decl, + function_versions[i].version_decl, + &function_versions[i].features, + mask_var, + *empty_bb); + } + + free (function_versions); + return 0; +} + +/* Return an identifier for the base assembler name of a versioned function. + This is computed by taking the default version's assembler name, and + stripping off the ".default" suffix if it's already been appended. */ + +static tree +get_suffixed_assembler_name (tree default_decl, const char *suffix) +{ + std::string name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (default_decl)); + + auto size = name.size (); + if (size >= 8 && name.compare (size - 8, 8, ".default") == 0) + name.resize (size - 8); + name += suffix; + return get_identifier (name.c_str ()); +} + +/* Make the resolver function decl to dispatch the versions of + a multi-versioned function, DEFAULT_DECL. IFUNC_ALIAS_DECL is + ifunc alias that will point to the created resolver. Create an + empty basic block in the resolver and store the pointer in + EMPTY_BB. Return the decl of the resolver function. 
*/ + +static tree +make_resolver_func (const tree default_decl, + const tree ifunc_alias_decl, + basic_block *empty_bb) +{ + tree decl, type, t; + + /* Create resolver function name based on default_decl. We need to remove an + existing ".default" suffix if this has already been appended. */ + tree decl_name = get_suffixed_assembler_name (default_decl, ".resolver"); + const char *resolver_name = IDENTIFIER_POINTER (decl_name); + + /* The resolver function should have signature + (void *) resolver (uint64_t, void *) */ + type = build_function_type_list (ptr_type_node, + uint64_type_node, + ptr_type_node, + NULL_TREE); + + decl = build_fn_decl (resolver_name, type); + SET_DECL_ASSEMBLER_NAME (decl, decl_name); + + DECL_NAME (decl) = decl_name; + TREE_USED (decl) = 1; + DECL_ARTIFICIAL (decl) = 1; + DECL_IGNORED_P (decl) = 1; + TREE_PUBLIC (decl) = 0; + DECL_UNINLINABLE (decl) = 1; + + /* Resolver is not external, body is generated. */ + DECL_EXTERNAL (decl) = 0; + DECL_EXTERNAL (ifunc_alias_decl) = 0; + + DECL_CONTEXT (decl) = NULL_TREE; + DECL_INITIAL (decl) = make_node (BLOCK); + DECL_STATIC_CONSTRUCTOR (decl) = 0; + + if (DECL_COMDAT_GROUP (default_decl) + || TREE_PUBLIC (default_decl)) + { + /* In this case, each translation unit with a call to this + versioned function will put out a resolver. Ensure it + is comdat to keep just one copy. */ + DECL_COMDAT (decl) = 1; + make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl)); + } + else + TREE_PUBLIC (ifunc_alias_decl) = 0; + + /* Build result decl and add to function_decl. */ + t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, ptr_type_node); + DECL_CONTEXT (t) = decl; + DECL_ARTIFICIAL (t) = 1; + DECL_IGNORED_P (t) = 1; + DECL_RESULT (decl) = t; + + /* Build parameter decls and add to function_decl. 
*/ + tree arg1 = build_decl (UNKNOWN_LOCATION, PARM_DECL, + get_identifier ("hwcap"), + uint64_type_node); + tree arg2 = build_decl (UNKNOWN_LOCATION, PARM_DECL, + get_identifier ("hwprobe_func"), + ptr_type_node); + DECL_CONTEXT (arg1) = decl; + DECL_CONTEXT (arg2) = decl; + DECL_ARTIFICIAL (arg1) = 1; + DECL_ARTIFICIAL (arg2) = 1; + DECL_IGNORED_P (arg1) = 1; + DECL_IGNORED_P (arg2) = 1; + DECL_ARG_TYPE (arg1) = uint64_type_node; + DECL_ARG_TYPE (arg2) = ptr_type_node; + DECL_ARGUMENTS (decl) = arg1; + TREE_CHAIN (arg1) = arg2; + + gimplify_function_tree (decl); + push_cfun (DECL_STRUCT_FUNCTION (decl)); + *empty_bb = init_lowered_empty_function (decl, false, + profile_count::uninitialized ()); + + cgraph_node::add_new_function (decl, true); + symtab->call_cgraph_insertion_hooks (cgraph_node::get_create (decl)); + + pop_cfun (); + + gcc_assert (ifunc_alias_decl != NULL); + /* Mark ifunc_alias_decl as "ifunc" with resolver as resolver_name. */ + DECL_ATTRIBUTES (ifunc_alias_decl) + = make_attribute ("ifunc", resolver_name, + DECL_ATTRIBUTES (ifunc_alias_decl)); + + /* Create the alias for dispatch to resolver here. */ + cgraph_node::create_same_body_alias (ifunc_alias_decl, decl); + return decl; +} + +/* Implement TARGET_MANGLE_DECL_ASSEMBLER_NAME, to add function multiversioning + suffixes. */ + +tree +riscv_mangle_decl_assembler_name (tree decl, tree id) +{ + /* For function version, add the target suffix to the assembler name. */ + if (TREE_CODE (decl) == FUNCTION_DECL + && DECL_FUNCTION_VERSIONED (decl)) + { + std::string name = IDENTIFIER_POINTER (id) + std::string ("."); + tree target_attr = lookup_attribute ("target_version", + DECL_ATTRIBUTES (decl)); + + if (target_attr == NULL_TREE) + { + name += "default"; + return get_identifier (name.c_str ()); + } + + const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE + (target_attr))); + + /* Replace non-alphanumeric characters with underscores as the suffix. 
*/ + for (const char *c = version_string; *c; c++) + name += ISALNUM (*c) == 0 ? '_' : *c; + + if (DECL_ASSEMBLER_NAME_SET_P (decl)) + SET_DECL_RTL (decl, NULL); + + id = get_identifier (name.c_str ()); + } + + return id; +} + +/* Implement TARGET_GENERATE_VERSION_DISPATCHER_BODY. */ + +tree +riscv_generate_version_dispatcher_body (void *node_p) +{ + tree resolver_decl; + basic_block empty_bb; + tree default_ver_decl; + struct cgraph_node *versn; + struct cgraph_node *node; + + struct cgraph_function_version_info *node_version_info = NULL; + struct cgraph_function_version_info *versn_info = NULL; + + node = (cgraph_node *)node_p; + + node_version_info = node->function_version (); + gcc_assert (node->dispatcher_function + && node_version_info != NULL); + + if (node_version_info->dispatcher_resolver) + return node_version_info->dispatcher_resolver; + + /* The first version in the chain corresponds to the default version. */ + default_ver_decl = node_version_info->next->this_node->decl; + + /* node is going to be an alias, so remove the finalized bit. */ + node->definition = false; + + resolver_decl = make_resolver_func (default_ver_decl, + node->decl, &empty_bb); + + node_version_info->dispatcher_resolver = resolver_decl; + + push_cfun (DECL_STRUCT_FUNCTION (resolver_decl)); + + auto_vec<tree, 2> fn_ver_vec; + + for (versn_info = node_version_info->next; versn_info; + versn_info = versn_info->next) + { + versn = versn_info->this_node; + /* Check for virtual functions here again, as by this time it should + have been determined if this function needs a vtable index or + not. This happens for methods in derived classes that override + virtual methods in base classes but are not explicitly marked as + virtual. 
*/ + if (DECL_VINDEX (versn->decl)) + sorry ("virtual function multiversioning not supported"); + + fn_ver_vec.safe_push (versn->decl); + } + + dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb); + cgraph_edge::rebuild_edges (); + pop_cfun (); + + /* Fix up symbol names. First we need to obtain the base name, which may + have already been mangled. */ + tree base_name = get_suffixed_assembler_name (default_ver_decl, ""); + + /* We need to redo the version mangling on the non-default versions for the + target_clones case. Redoing the mangling for the target_version case is + redundant but does no harm. We need to skip the default version, because + expand_clones will append ".default" later; fortunately that suffix is the + one we want anyway. */ + for (versn_info = node_version_info->next->next; versn_info; + versn_info = versn_info->next) + { + tree version_decl = versn_info->this_node->decl; + tree name = riscv_mangle_decl_assembler_name (version_decl, + base_name); + symtab->change_decl_assembler_name (version_decl, name); + } + + /* We also need to use the base name for the ifunc declaration. */ + symtab->change_decl_assembler_name (node->decl, base_name); + + return resolver_decl; +} + +/* Make a dispatcher declaration for the multi-versioned function DECL. + Calls to DECL function will be replaced with calls to the dispatcher + by the front-end. Returns the decl of the dispatcher function. 
*/ + +tree +riscv_get_function_versions_dispatcher (void *decl) +{ + tree fn = (tree) decl; + struct cgraph_node *node = NULL; + struct cgraph_node *default_node = NULL; + struct cgraph_function_version_info *node_v = NULL; + struct cgraph_function_version_info *first_v = NULL; + + tree dispatch_decl = NULL; + + struct cgraph_function_version_info *default_version_info = NULL; + + gcc_assert (fn != NULL && DECL_FUNCTION_VERSIONED (fn)); + + node = cgraph_node::get (fn); + gcc_assert (node != NULL); + + node_v = node->function_version (); + gcc_assert (node_v != NULL); + + if (node_v->dispatcher_resolver != NULL) + return node_v->dispatcher_resolver; + + /* Find the default version and make it the first node. */ + first_v = node_v; + /* Go to the beginning of the chain. */ + while (first_v->prev != NULL) + first_v = first_v->prev; + default_version_info = first_v; + + while (default_version_info != NULL) + { + struct riscv_feature_bits res; + int priority; /* Unused. */ + parse_features_for_version (default_version_info->this_node->decl, + res, priority); + if (res.length == 0) + break; + default_version_info = default_version_info->next; + } + + /* If there is no default node, just return NULL. */ + if (default_version_info == NULL) + return NULL; + + /* Make default info the first node. */ + if (first_v != default_version_info) + { + default_version_info->prev->next = default_version_info->next; + if (default_version_info->next) + default_version_info->next->prev = default_version_info->prev; + first_v->prev = default_version_info; + default_version_info->next = first_v; + default_version_info->prev = NULL; + } + + default_node = default_version_info->this_node; + + if (targetm.has_ifunc_p ()) + { + struct cgraph_function_version_info *it_v = NULL; + struct cgraph_node *dispatcher_node = NULL; + struct cgraph_function_version_info *dispatcher_version_info = NULL; + + /* Right now, the dispatching is done via ifunc. 
*/ + dispatch_decl = make_dispatcher_decl (default_node->decl); + TREE_NOTHROW (dispatch_decl) = TREE_NOTHROW (fn); + + dispatcher_node = cgraph_node::get_create (dispatch_decl); + gcc_assert (dispatcher_node != NULL); + dispatcher_node->dispatcher_function = 1; + dispatcher_version_info + = dispatcher_node->insert_new_function_version (); + dispatcher_version_info->next = default_version_info; + dispatcher_node->definition = 1; + + /* Set the dispatcher for all the versions. */ + it_v = default_version_info; + while (it_v != NULL) + { + it_v->dispatcher_resolver = dispatch_decl; + it_v = it_v->next; + } + } + else + { + error_at (DECL_SOURCE_LOCATION (default_node->decl), + "multiversioning needs %<ifunc%> which is not supported " + "on this target"); + } + + return dispatch_decl; +} + /* On riscv we have an ABI defined safe buffer. This constant is used to determining the probe offset for alloca. */ @@ -12600,6 +13336,10 @@ riscv_stack_clash_protection_alloca_probe_range (void) #undef TARGET_OPTION_VALID_ATTRIBUTE_P #define TARGET_OPTION_VALID_ATTRIBUTE_P riscv_option_valid_attribute_p +#undef TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P +#define TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P \ + riscv_option_valid_version_attribute_p + #undef TARGET_LEGITIMIZE_ADDRESS #define TARGET_LEGITIMIZE_ADDRESS riscv_legitimize_address @@ -12948,6 +13688,23 @@ riscv_stack_clash_protection_alloca_probe_range (void) #undef TARGET_C_MODE_FOR_FLOATING_TYPE #define TARGET_C_MODE_FOR_FLOATING_TYPE riscv_c_mode_for_floating_type +#undef TARGET_OPTION_FUNCTION_VERSIONS +#define TARGET_OPTION_FUNCTION_VERSIONS riscv_common_function_versions + +#undef TARGET_COMPARE_VERSION_PRIORITY +#define TARGET_COMPARE_VERSION_PRIORITY riscv_compare_version_priority + +#undef TARGET_GENERATE_VERSION_DISPATCHER_BODY +#define TARGET_GENERATE_VERSION_DISPATCHER_BODY \ + riscv_generate_version_dispatcher_body + +#undef TARGET_GET_FUNCTION_VERSIONS_DISPATCHER +#define TARGET_GET_FUNCTION_VERSIONS_DISPATCHER \ + 
+  riscv_get_function_versions_dispatcher
+
+#undef TARGET_MANGLE_DECL_ASSEMBLER_NAME
+#define TARGET_MANGLE_DECL_ASSEMBLER_NAME riscv_mangle_decl_assembler_name
+
 struct gcc_target targetm = TARGET_INITIALIZER;

 #include "gt-riscv.h"
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index ca1b8329cdc..8a8b08b6b51 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -1298,4 +1298,11 @@ extern void riscv_remove_unneeded_save_restore_calls (void);
                                     STACK_BOUNDARY / BITS_PER_UNIT) \
    : (crtl->outgoing_args_size + STACK_POINTER_OFFSET))

+/* According to the RISC-V C API, the arch string may contain ','. To avoid
+   the conflict with the default separator, we choose '#' as the separator for
+   the target attribute. */
+#define TARGET_CLONES_ATTR_SEPARATOR '#'
+
+#define TARGET_HAS_FMV_TARGET_ATTRIBUTE 0
+
 #endif /* ! GCC_RISCV_H */
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 6360ed3984d..61def798ca0 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -523,6 +523,9 @@ Mask(XSFVCP) Var(riscv_sifive_subext)

 Mask(XSFCEASE) Var(riscv_sifive_subext)

+TargetVariable
+int riscv_fmv_priority = 0
+
 Enum
 Name(isa_spec_class) Type(enum riscv_isa_spec_class)
 Supported ISA specs (for use with the -misa-spec= option):
diff --git a/gcc/defaults.h b/gcc/defaults.h
index ac2d25852ab..918e3ec2f24 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -874,6 +874,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define TARGET_HAS_FMV_TARGET_ATTRIBUTE 1
 #endif

+/* Select an attribute separator for function multiversioning. */
+#ifndef TARGET_CLONES_ATTR_SEPARATOR
+#define TARGET_CLONES_ATTR_SEPARATOR ','
+#endif

 /* Select a format to encode pointers in exception handling data. We
    prefer those that result in fewer dynamic relocations.
Assume no
diff --git a/gcc/multiple_target.cc b/gcc/multiple_target.cc
index 1fdd279da04..c1e358dfc1e 100644
--- a/gcc/multiple_target.cc
+++ b/gcc/multiple_target.cc
@@ -180,7 +180,7 @@ create_dispatcher_calls (struct cgraph_node *node)
     }
 }

-/* Create string with attributes separated by comma.
+/* Create string with attributes separated by TARGET_CLONES_ATTR_SEPARATOR.
    Return number of attributes. */

 static int
@@ -194,17 +194,21 @@ get_attr_str (tree arglist, char *attr_str)
     {
       const char *str = TREE_STRING_POINTER (TREE_VALUE (arg));
       size_t len = strlen (str);
-      for (const char *p = strchr (str, ','); p; p = strchr (p + 1, ','))
+      for (const char *p = strchr (str, TARGET_CLONES_ATTR_SEPARATOR);
+           p;
+           p = strchr (p + 1, TARGET_CLONES_ATTR_SEPARATOR))
         argnum++;
       memcpy (attr_str + str_len_sum, str, len);
-      attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0';
+      attr_str[str_len_sum + len]
+        = TREE_CHAIN (arg) ? TARGET_CLONES_ATTR_SEPARATOR : '\0';
       str_len_sum += len + 1;
       argnum++;
     }
   return argnum;
 }

-/* Return number of attributes separated by comma and put them into ARGS.
+/* Return number of attributes separated by TARGET_CLONES_ATTR_SEPARATOR
+   and put them into ARGS.
    If there is no DEFAULT attribute return -1.
    If there is an empty string in attribute return -2.
    If there are multiple DEFAULT attributes return -3.
@@ -215,9 +219,10 @@ separate_attrs (char *attr_str, char **attrs, int attrnum)
 {
   int i = 0;
   int default_count = 0;
+  static const char separator_str[] = { TARGET_CLONES_ATTR_SEPARATOR, 0 };

-  for (char *attr = strtok (attr_str, ",");
-       attr != NULL; attr = strtok (NULL, ","))
+  for (char *attr = strtok (attr_str, separator_str);
+       attr != NULL; attr = strtok (NULL, separator_str))
     {
       if (strcmp (attr, "default") == 0)
         {
@@ -305,7 +310,7 @@ static bool
 expand_target_clones (struct cgraph_node *node, bool definition)
 {
   int i;
-  /* Parsing target attributes separated by comma.
*/
+  /* Parsing target attributes separated by TARGET_CLONES_ATTR_SEPARATOR. */
   tree attr_target = lookup_attribute ("target_clones",
                                        DECL_ATTRIBUTES (node->decl));
   /* No targets specified. */