Message ID | 87wq66ehly.fsf@gmail.com |
---|---|
State | New |
Headers | show |
On Fri, 2014-12-05 at 01:27 -0500, Ulrich Drepper wrote: > If you generate code with the JIT which references outside symbols there > is currently no way to have a self-contained DSO created. The command > line to invoke the linker is fixed. What's the use-case here? Sorry if I'm not getting at what this is for. If I understand things correctly, an example use-case here is: Executable "foo" is not yet linked against, say libpng.so, and wants to JIT-compile code that uses libpng. Hence: foo -> libgccjit.so JIT-compile code "jitted_function" that calls, say, png_uint_32 png_access_version_number (void); This gives a fake.so with an imported symbol of "png_access_version_number" and a NEEDED of libpng.so.something. Upon attempting to dlopen (fake.so) and run "jitted_function", then presumably one of several things can happen: * libpng.so.N was already loaded into "foo" * libpng.so.N is loaded when fake.so is dlopened * libpng.so.N is not found (currently libgccjit.so is dlopening fake.so with RTLD_NOW | RTLD_LOCAL). Is the "self-containedness of the DSO" in your patch aimed at ensuring that libpng.so.N gets unloaded when fake.so is unloaded? Or is this more about having any errors happen at compilation time, rather than symbol load/run time? > The patch below would change that. It builds upon the existing > framework to specify options for the compiler. The linker optimization > flag fits fully into the existing functionality. For additional files > to link with I had to extend the mechanism a bit since it is not just > one string that needs to be remembered. > > I've also added the set_str_option member function to the C++ interface > of the library. That must have been an oversight. One issue here is the lifetime of str options; currently str options simply record the const char *, without taking a copy of the underlying buffer. We might need to change this to make it take a strdup of the option, to avoid nasty surprises if someone calls set_str_option with a std::string and has it auto-coerced to a const char * from under them. > What do you think? I'm still not clear about the problem you're solving here; sorry. New options should be documented in: gcc/jit/docs/topics/contexts.rst in the "Options" section. and these ones should probably be mentioned in the subsection on GCC_JIT_FUNCTION_IMPORTED in functions.rst. Sadly, the "Tutorial" part of the current docs is missing any kind of discussion of using functions from other DSOs - sorry - which sounds like something I/we should fix, especially if we need to add options for it. I've filed that omission as PR jit/64201. We'd need one or more testcase(s) to exercise the options. Various comments inline: > gcc/ChangeLog: > > 2014-12-05 Ulrich Drepper <drepper@gmail.com> > > * jit/libgccjit++.h (context): Add missing set_str_option > member function. > > * jit/libgccjit.h (gcc_jit_int_option): Add > GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL. > (gcc_jit_str_option): Add GCC_JIT_STR_OPTION_LINKFILE. > * jit/jit-playback.c (convert_to_dso): Use auto_vec instead > of fixed-sized array for arguments. Define ADD_ARG macro > to add to it. Adjust existing code. Additionally add > optimization level and additional link files to the list. > * jit/jit-playback.h (context::get_linkfiles): New member > function. > * jit/jit-recording.c (recording::context:set_str_option): > Handle GCC_JIT_STR_OPTION_LINKFILE. > * jit/jit-recording.h (recording::context:set_str_option): > Add get_linkfiles member function. > > diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c > index ecdae80..9c4e45f 100644 > --- a/gcc/jit/jit-playback.c > +++ b/gcc/jit/jit-playback.c > @@ -1726,18 +1726,19 @@ convert_to_dso (const char *ctxt_progname) > TV_ASSEMBLE. */ > auto_timevar assemble_timevar (TV_ASSEMBLE); > const char *errmsg; > - const char *argv[7]; > + auto_vec <const char *> argvec; > +#define ADD_ARG(arg) argvec.safe_push (arg) > int exit_status = 0; > int err = 0; > const char *gcc_driver_name = GCC_DRIVER_NAME; > > - argv[0] = gcc_driver_name; > - argv[1] = "-shared"; > + ADD_ARG (gcc_driver_name); > + ADD_ARG ("-shared"); > /* The input: assembler. */ > - argv[2] = m_path_s_file; > + ADD_ARG (m_path_s_file); > /* The output: shared library. */ > - argv[3] = "-o"; > - argv[4] = m_path_so_file; > + ADD_ARG ("-o"); > + ADD_ARG (m_path_so_file); This conversion from an array to an auto_vec via ADD_ARG looks good to me. > /* Don't use the linker plugin. > If running with just a "make" and not a "make install", then we'd > @@ -1746,17 +1747,39 @@ convert_to_dso (const char *ctxt_progname) > libto_plugin is a .la at build time, with it becoming installed with > ".so" suffix: i.e. it doesn't exist with a .so suffix until install > time. */ > - argv[5] = "-fno-use-linker-plugin"; > + ADD_ARG ("-fno-use-linker-plugin"); > + > + /* Linker int options. */ > + switch (get_int_option (GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL)) > + { > + default: > + add_error (NULL, > + "unrecognized linker optimization level: %i", > + get_int_option (GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL)); > + return; > + > + case 0: > + break; > + > + case 1: > + ADD_ARG ("-Wl,-O"); > + break; > + } Note to myself: this means that if the client code supplies the option, then the library will invoke the harness so that latter invokes the linker with -O. Note to myself: "man ld" says of this option: -O level If level is a numeric values greater than zero ld optimizes the output. This might take significantly longer and therefore probably should only be enabled for the final binary. At the moment this option only affects ELF shared library generation. Future releases of the linker may make more use of this option. Also currently there is no difference in the linker's behaviour for different non-zero values of this option. Again this may change with future releases. Do you have a sense of what impact setting the option would have on the time taken by gcc_jit_context_compile? > + const char *elt; > + const auto_vec<const char *>& linkfiles = get_linkfiles(); > + for (unsigned ix = 0; linkfiles.iterate(ix, &elt); ++ix) > + ADD_ARG (elt); This doesn't support nested contexts; presumably this should walk up through any parent contexts, adding any linkfiles requested by them? Though given that I don't fully understand your use-case, I'm not sure quite what the correct behavior here should be. [...snip...] > diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h > index 02f08ba..2726347 100644 > --- a/gcc/jit/jit-playback.h > +++ b/gcc/jit/jit-playback.h > @@ -175,6 +175,12 @@ public: > return m_recording_ctxt->get_bool_option (opt); > } > > + const auto_vec <const char *>& > + get_linkfiles () const > + { > + return m_recording_ctxt->get_linkfiles (); > + } Here's another place where nested contexts may need to be supported: a playback context's m_recording_ctxt may have ancestors, and they might have linkfiles specified. This is vaguely analogous to header files (the parent recording contexts) vs the source file (the m_recording_ctxt of the playback context), if that makes any sense. Again, given that I don't fully understand your use-case, I'm not sure quite what the correct behavior here should be. > diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c > index 82ec399..a6d64f9 100644 > --- a/gcc/jit/jit-recording.c > +++ b/gcc/jit/jit-recording.c > @@ -827,7 +827,17 @@ recording::context::set_str_option (enum gcc_jit_str_option opt, > "unrecognized (enum gcc_jit_str_option) value: %i", opt); > return; > } > + > + switch (opt) > + { > + default: > m_str_options[opt] = value; > + break; > + > + case GCC_JIT_STR_OPTION_LINKFILE: > + m_linkfiles.safe_push (value); > + break; > + } > } (As mentioned above, maybe we should change set_str_option so it takes a strdup of the value) I notice that this string option works differently from the others, in that it appends to a list, rather than overwriting a value; that would need spelling out in the documentation. > /* Set the given integer option for this context, or add an error if > diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h > index 31fb304..4b21248 100644 > --- a/gcc/jit/jit-recording.h > +++ b/gcc/jit/jit-recording.h > @@ -209,6 +209,12 @@ public: > return m_bool_options[opt]; > } > > + const auto_vec<const char *>& > + get_linkfiles (void) const > + { > + return m_linkfiles; > + } > + > result * > compile (); > > @@ -249,6 +255,7 @@ private: > const char *m_str_options[GCC_JIT_NUM_STR_OPTIONS]; > int m_int_options[GCC_JIT_NUM_INT_OPTIONS]; > bool m_bool_options[GCC_JIT_NUM_BOOL_OPTIONS]; > + auto_vec <const char *> m_linkfiles; > > /* Recorded API usage. */ > auto_vec<memento *> m_mementos; > @@ -1584,4 +1591,3 @@ private: > } // namespace gcc > > #endif /* JIT_RECORDING_H */ > - > diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h > index 67ed5d5..232bb0f 100644 > --- a/gcc/jit/libgccjit++.h > +++ b/gcc/jit/libgccjit++.h > @@ -99,6 +99,9 @@ namespace gccjit > void dump_to_file (const std::string &path, > bool update_locations); > > + void set_str_option (enum gcc_jit_str_option opt, > + const char *value); > + > void set_int_option (enum gcc_jit_int_option opt, > int value); > > @@ -535,6 +538,14 @@ context::dump_to_file (const std::string &path, > } > > inline void > +context::set_str_option (enum gcc_jit_str_option opt, > + const char *value) > +{ > + gcc_jit_context_set_str_option (m_inner_ctxt, opt, value); > + > +} I wondered if this should take a std::string instead of a const char *, but a const char * is probably more flexible, given that you can go trivially from a std::string to a const char *, but going the other way may cost some cycles. > diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h > index ed6390e..da3a2bf 100644 > --- a/gcc/jit/libgccjit.h > +++ b/gcc/jit/libgccjit.h > @@ -139,6 +139,11 @@ enum gcc_jit_str_option > messages to stderr. If NULL, or default, "libgccjit.so" is used. */ > GCC_JIT_STR_OPTION_PROGNAME, > > + /* Additional files added to the link command line. To be used to > + name libraries needed to satisfy dependencies in the generated > + code. */ > + GCC_JIT_STR_OPTION_LINKFILE, This descriptive comment needs fleshing out. For example, are these filenames, or SONAMEs? How does this relate to what a user would pass to the linker command line if they were writing a Makefile rather than code that's calling into a JIT API? It's usually best to give a concrete example. At that point, the text may be rather long for a comment in a .h, so it may be worth moving the bulk of it to the .rst documentation, and having the .h comment summarize it and say to see the documentation for more details. The name of the option could be made more descriptive; if the intent here is that this is an additional thing to link to, could this be: GCC_JIT_STR_OPTION_EXTRA_LIBRARY or somesuch? Or could this be a new API entrypoint altogether e.g. gcc_jit_context_requires_library (gcc_jit_context *ctxt, const char *soname); ? (maybe with extra params? flags?) One other issue here is that I've deliberately been a bit coy in the user-facing documentation about the fact that the linker runs at all. The assembler and linker show up in the profile, and at some point it may be worth embedding them in-process somehow, so I wanted to try to keep some of this as an implementation detail that's subject to change. The user-facing docs do mention that a .so file is created. > GCC_JIT_NUM_STR_OPTIONS > }; > > @@ -152,6 +157,11 @@ enum gcc_jit_int_option > The default value is 0 (unoptimized). */ > GCC_JIT_INT_OPTION_OPTIMIZATION_LEVEL, > > + /* Optimization level for the linker. > + > + The default value is 0 (unoptimized). */ > + GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL, > + > GCC_JIT_NUM_INT_OPTIONS > }; Thanks; hope the above made sense Dave
On Fri, Dec 5, 2014 at 2:24 PM, David Malcolm <dmalcolm@redhat.com> wrote: > What's the use-case here? Sorry if I'm not getting at what this is for. The use case is that a program wants to use library functions, something common, not everything is self-contained and linked in automatically (like libc). Currently you would have to rely on the fact that a DSO can be created with dangling references which are expected to be somehow fulfilled at runtime. There are multiple problems with this: First, even if the application using the JIT itself is linked against a library which the JIT-generated code wants to use it is a problem if the definitions are accidentally found. If the library with the desired function in question uses symbol versioning the JIT-created DSO would have just an ordinary UNDEF entry for the symbol with no symbol version available. This then means that at runtime the /oldest/ version is picked. That not what you want in this case. Second, if you implement some form of extension language where the language allows to reference functions in other DSOs, you'd have to either use dlopen(RTLD_GLOBAL) in the main app (evil, ever use RTLD_GLOBAL) or you'd have to implicitly have the generated code use dlopen() and the dlsym(). That's cumbersome at best and also slow. On the other hand, with an option as proposed the code generator could simply record the dependency and have the DSO automatically used at link-time and runtime, creating the correct references etc. > Is the "self-containedness of the DSO" in your patch aimed at ensuring > that libpng.so.N gets unloaded when fake.so is unloaded? The unloading part is a nice additional benefit. It's mostly about the possibility to make it easily and quickly possible to call any function from any available DSO without having to know which DSOs are needed at the time the application using the JIT is linked. > One issue here is the lifetime of str options; currently str options > simply record the const char *, without taking a copy of the underlying > buffer. We might need to change this to make it take a strdup of the > option, to avoid nasty surprises if someone calls set_str_option with a > std::string and has it auto-coerced to a const char * from under them. I'm fine with that, I just followed what you did so far. If you want it done this way I'll add this to the patch. > New options should be documented in: > gcc/jit/docs/topics/contexts.rst in the "Options" section. > and these ones should probably be mentioned in the subsection on > GCC_JIT_FUNCTION_IMPORTED in functions.rst. I was more concerned with the code first... ;-) > Do you have a sense of what impact setting the option would have on the > time taken by gcc_jit_context_compile? It's really not much. The linker just tries different sizes for a hash table and picks the size with the least number of conflicts and therefore hopefully best performance at runtime. With today's machines this isn't really noticeable. Jakub (if you read this), when did we implement this? It still might not be a good idea to enable it by default and, as written, there might be other optimizations which are implemented. > This doesn't support nested contexts; presumably this should walk up > through any parent contexts, adding any linkfiles requested by them? Nested contexts? Do you deal with with gcc_jit_contact structures recursively? I must miss that. This is just a way to add more strings (free-form parameters) to the linker command line. I'm using ctxt.set_str_option(GCC_JIT_STR_OPTION_LINKFILE, "-lsomelibrary"); to have fake.so linked against libsomelibrary.so. > Here's another place where nested contexts may need to be supported: a > playback context's m_recording_ctxt may have ancestors, and they might > have linkfiles specified. This isn't the playback context structure, it the toplevel (gccjit::context) one. As far I can see there is no hierarchy and this makes sense. > I notice that this string option works differently from the others, in > that it appends to a list, rather than overwriting a value; that would > need spelling out in the documentation. Yes, sure, documentation is nothing I've concerned myself at that point. > I wondered if this should take a std::string instead of a const char *, > but a const char * is probably more flexible, given that you can go > trivially from a std::string to a const char *, but going the other way > may cost some cycles. If we want to make copies anyway I think it doesn't matter. I think using const char* is easier to use for the reasons you spelled out. > This descriptive comment needs fleshing out. For example, are these > filenames, or SONAMEs? How does this relate to what a user would pass > to the linker command line if they were writing a Makefile rather than > code that's calling into a JIT API? The strings are supposed to be exactly what you would add to the linker command line. No magic. In fact, the same mechanism can be used to pass arbitrary other options to the linker. These are just strings. An open question therefore is: should that be prohibited? Doubtful in my mind since it's not really possible to distinguish between linker options and filenames. "-d" is a valid file name. And hardcoding all known linker options and not allowing them is so against good practices. Aside, it might be necessary to define runpath using -Wl,-R,some/path. > It's usually best to give a concrete example. In the comments? In the documentation for sure. > The name of the option could be made more descriptive; if the intent > here is that this is an additional thing to link to, could this be: > GCC_JIT_STR_OPTION_EXTRA_LIBRARY > or somesuch? But it won't always be libraries. Maybe it's a .o file. > Or could this be a new API entrypoint altogether e.g. > > gcc_jit_context_requires_library (gcc_jit_context *ctxt, > const char *soname); That's doable. I was playing with the idea but it's less flexible. It would also means that another interface to add .o files should be added. The benefit is that it could be checked at the time of the call whether the respective file is what the caller wants to use it for and whether it exists in the first place. On the plus side, this would allow adding an extra parameter to the interface to specify at the same time the runpath. > One other issue here is that I've deliberately been a bit coy in the > user-facing documentation about the fact that the linker runs at all. > The assembler and linker show up in the profile, and at some point it > may be worth embedding them in-process somehow, so I wanted to try to > keep some of this as an implementation detail that's subject to change. > The user-facing docs do mention that a .so file is created. Understandable and this is what I wrote the (yet unfinished) code in elfutils for. Both the linker and assembler code can trivially be changed into libraries. So, a general "pass a string" option might indeed give rise to some unintended uses. So perhaps adding the two new interfaces is indeed what we should do. If you agree, I can hack that up.
diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c index ecdae80..9c4e45f 100644 --- a/gcc/jit/jit-playback.c +++ b/gcc/jit/jit-playback.c @@ -1726,18 +1726,19 @@ convert_to_dso (const char *ctxt_progname) TV_ASSEMBLE. */ auto_timevar assemble_timevar (TV_ASSEMBLE); const char *errmsg; - const char *argv[7]; + auto_vec <const char *> argvec; +#define ADD_ARG(arg) argvec.safe_push (arg) int exit_status = 0; int err = 0; const char *gcc_driver_name = GCC_DRIVER_NAME; - argv[0] = gcc_driver_name; - argv[1] = "-shared"; + ADD_ARG (gcc_driver_name); + ADD_ARG ("-shared"); /* The input: assembler. */ - argv[2] = m_path_s_file; + ADD_ARG (m_path_s_file); /* The output: shared library. */ - argv[3] = "-o"; - argv[4] = m_path_so_file; + ADD_ARG ("-o"); + ADD_ARG (m_path_so_file); /* Don't use the linker plugin. If running with just a "make" and not a "make install", then we'd @@ -1746,17 +1747,39 @@ convert_to_dso (const char *ctxt_progname) libto_plugin is a .la at build time, with it becoming installed with ".so" suffix: i.e. it doesn't exist with a .so suffix until install time. */ - argv[5] = "-fno-use-linker-plugin"; + ADD_ARG ("-fno-use-linker-plugin"); + + /* Linker int options. */ + switch (get_int_option (GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL)) + { + default: + add_error (NULL, + "unrecognized linker optimization level: %i", + get_int_option (GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL)); + return; + + case 0: + break; + + case 1: + ADD_ARG ("-Wl,-O"); + break; + } + + const char *elt; + const auto_vec<const char *>& linkfiles = get_linkfiles(); + for (unsigned ix = 0; linkfiles.iterate(ix, &elt); ++ix) + ADD_ARG (elt); /* pex argv arrays are NULL-terminated. */ - argv[6] = NULL; + ADD_ARG (NULL); /* pex_one's error-handling requires pname to be non-NULL. */ gcc_assert (ctxt_progname); errmsg = pex_one (PEX_SEARCH, /* int flags, */ gcc_driver_name, - const_cast<char * const *> (argv), + const_cast <char *const *> (argvec.address ()), ctxt_progname, /* const char *pname */ NULL, /* const char *outname */ NULL, /* const char *errname */ @@ -1783,6 +1806,7 @@ convert_to_dso (const char *ctxt_progname) getenv ("PATH")); return; } +#undef ADD_ARG } /* Top-level hook for playing back a recording context. diff --git a/gcc/jit/jit-playback.h b/gcc/jit/jit-playback.h index 02f08ba..2726347 100644 --- a/gcc/jit/jit-playback.h +++ b/gcc/jit/jit-playback.h @@ -175,6 +175,12 @@ public: return m_recording_ctxt->get_bool_option (opt); } + const auto_vec <const char *>& + get_linkfiles () const + { + return m_recording_ctxt->get_linkfiles (); + } + builtins_manager *get_builtins_manager () const { return m_recording_ctxt->get_builtins_manager (); diff --git a/gcc/jit/jit-recording.c b/gcc/jit/jit-recording.c index 82ec399..a6d64f9 100644 --- a/gcc/jit/jit-recording.c +++ b/gcc/jit/jit-recording.c @@ -827,7 +827,17 @@ recording::context::set_str_option (enum gcc_jit_str_option opt, "unrecognized (enum gcc_jit_str_option) value: %i", opt); return; } + + switch (opt) + { + default: m_str_options[opt] = value; + break; + + case GCC_JIT_STR_OPTION_LINKFILE: + m_linkfiles.safe_push (value); + break; + } } /* Set the given integer option for this context, or add an error if diff --git a/gcc/jit/jit-recording.h b/gcc/jit/jit-recording.h index 31fb304..4b21248 100644 --- a/gcc/jit/jit-recording.h +++ b/gcc/jit/jit-recording.h @@ -209,6 +209,12 @@ public: return m_bool_options[opt]; } + const auto_vec<const char *>& + get_linkfiles (void) const + { + return m_linkfiles; + } + result * compile (); @@ -249,6 +255,7 @@ private: const char *m_str_options[GCC_JIT_NUM_STR_OPTIONS]; int m_int_options[GCC_JIT_NUM_INT_OPTIONS]; bool m_bool_options[GCC_JIT_NUM_BOOL_OPTIONS]; + auto_vec <const char *> m_linkfiles; /* Recorded API usage. */ auto_vec<memento *> m_mementos; @@ -1584,4 +1591,3 @@ private: } // namespace gcc #endif /* JIT_RECORDING_H */ - diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h index 67ed5d5..232bb0f 100644 --- a/gcc/jit/libgccjit++.h +++ b/gcc/jit/libgccjit++.h @@ -99,6 +99,9 @@ namespace gccjit void dump_to_file (const std::string &path, bool update_locations); + void set_str_option (enum gcc_jit_str_option opt, + const char *value); + void set_int_option (enum gcc_jit_int_option opt, int value); @@ -535,6 +538,14 @@ context::dump_to_file (const std::string &path, } inline void +context::set_str_option (enum gcc_jit_str_option opt, + const char *value) +{ + gcc_jit_context_set_str_option (m_inner_ctxt, opt, value); + +} + +inline void context::set_int_option (enum gcc_jit_int_option opt, int value) { diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h index ed6390e..da3a2bf 100644 --- a/gcc/jit/libgccjit.h +++ b/gcc/jit/libgccjit.h @@ -139,6 +139,11 @@ enum gcc_jit_str_option messages to stderr. If NULL, or default, "libgccjit.so" is used. */ GCC_JIT_STR_OPTION_PROGNAME, + /* Additional files added to the link command line. To be used to + name libraries needed to satisfy dependencies in the generated + code. */ + GCC_JIT_STR_OPTION_LINKFILE, + GCC_JIT_NUM_STR_OPTIONS }; @@ -152,6 +157,11 @@ enum gcc_jit_int_option The default value is 0 (unoptimized). */ GCC_JIT_INT_OPTION_OPTIMIZATION_LEVEL, + /* Optimization level for the linker. + + The default value is 0 (unoptimized). */ + GCC_JIT_INT_OPTION_LINK_OPTIMIZATION_LEVEL, + GCC_JIT_NUM_INT_OPTIONS };