Message ID | 1561617445-9328-1-git-send-email-indu.bhagat@oracle.com |
---|---|
Series | Support for CTF in GCC |
Ping. Can someone please review these patches ? We would like to get the support for CTF integrated soon. Thanks Indu On Wed, Jun 26, 2019 at 11:38 PM Indu Bhagat <indu.bhagat@oracle.com> wrote: > > Hello, > > This patch series adds support for CTF generation in GCC. > > [Changes from V2] > - Patch 1, 2, and 3 have minor edits if any. > - Patch 4 is a new addition. > - Patch 5 is a new addition. > > Summary of the GCC RFC V3 patch set : > Patch 1, 2, and 3 do the preparatory work of adding the CTF command line options > and setting up the framework for CTF generation and emission. More details on > these patches can be seen in the previous posting > https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00718.html > > With Patch 4 in the current set, the compiler can generate a .ctf section for a > single compilation unit if -gt (when unspecified, LEVEL defaults to 2) or -gt2 > is specified. Recall that -gt2 produces type information for entities > (functions, variables etc.) at file-scope or global-scope. > > For each translation unit, a CTF container (ctf_container_t) is used to > keep the generated CTF. Two hash_map structures are kept to hold the generated > CTF for type and variables. CTF does need pre-processing before emission into > a section; there are code comments in ctfout.c to help understand this. > > There are a couple of TBDs and FIXMEs in Patch 4 which will be resolved as I > progress further; Inputs on some of which will be very helpful : > > - ctf_dtdef_hash : The compiler uses a hashing scheme to keep track of whether > CTF has been generated for a type of decl. For a type, the hashing scheme > uses TYPE_UID, but for a decl it uses htab_hash_pointer (decl). Is there a > better way to do this ? (See hash_dtd_tree_decl in ctfout.c) > > - delete_ctf_container routine in ctfout.c : I have used the GTY (()) tags in > the CTF container structs. 
Does this ensure that if I set the CTF container > global variable (ctfc) to NULL, the garbage collection machinery will take > care of cleaning up the internals of the container (including hash_map)? > Haven't been able to get a definitive answer looking at the code in > hash-map.h and the generated code in gtype-desc.c. > > Testing : > - Bootstrapped and regression tested on x86_64/linux and aarch64/linux. > Also bootstrapped on SPARC64/linux with some testing. > - Parsed .ctf sections of libdtrace-ctf files via a CTF dumping utility on > x86_64/linux. This simply ensures that the CTF sections are well-formed. > - Interaction with an internally available GDB looks promising. Basic whatis > and ptype tests work. GDB patches to uptake CTF debug info are in the works > and will be upstreamed soon. > > In the subsequent patches, I intend to close some open ends in the current > patch and add LTO support. > > Thanks, > > Indu Bhagat (5): > Add new function lang_GNU_GIMPLE > Add CTF command line options : -gtLEVEL > Setup for CTF generation and emission > CTF generation for a single compilation unit > Update CTF testsuite > > gcc/ChangeLog | 91 + > gcc/Makefile.in | 5 + > gcc/cgraphunit.c | 12 +- > gcc/common.opt | 9 + > gcc/ctfcreate.c | 526 ++++++ > gcc/ctfout.c | 1739 ++++++++++++++++++++ > gcc/ctfout.h | 359 ++++ > gcc/ctfutils.c | 198 +++ > gcc/doc/invoke.texi | 16 + > gcc/flag-types.h | 13 + > gcc/gengtype.c | 4 +- > gcc/langhooks.c | 9 + > gcc/langhooks.h | 1 + > gcc/opts.c | 26 + > gcc/passes.c | 7 +- > gcc/testsuite/ChangeLog | 35 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-1.c | 6 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-2.c | 10 + > .../gcc.dg/debug/ctf/ctf-anonymous-struct-1.c | 23 + > .../gcc.dg/debug/ctf/ctf-anonymous-union-1.c | 26 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-array-1.c | 31 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-1.c | 30 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-2.c | 39 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c | 44 + 
> gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-2.c | 30 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-3.c | 41 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-1.c | 21 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-float-1.c | 16 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-1.c | 36 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-2.c | 16 + > .../gcc.dg/debug/ctf/ctf-function-pointers-1.c | 24 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-functions-1.c | 34 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-int-1.c | 17 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-pointers-1.c | 26 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-preamble-1.c | 11 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-str-table-1.c | 26 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-1.c | 25 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-2.c | 30 + > .../gcc.dg/debug/ctf/ctf-struct-array-1.c | 36 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-1.c | 23 + > .../gcc.dg/debug/ctf/ctf-typedef-struct-1.c | 12 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-union-1.c | 14 + > gcc/testsuite/gcc.dg/debug/ctf/ctf-variables-1.c | 25 + > gcc/testsuite/gcc.dg/debug/ctf/ctf.exp | 41 + > gcc/testsuite/gcc.dg/debug/dwarf2-ctf-1.c | 7 + > gcc/toplev.c | 18 + > include/ChangeLog | 8 + > include/ctf.h | 487 ++++++ > 48 files changed, 4277 insertions(+), 6 deletions(-) > create mode 100644 gcc/ctfcreate.c > create mode 100644 gcc/ctfout.c > create mode 100644 gcc/ctfout.h > create mode 100644 gcc/ctfutils.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-2.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-anonymous-struct-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-anonymous-union-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-bitfields-2.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c > create mode 100644 
gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-2.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-3.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-enum-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-float-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-forward-2.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-function-pointers-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-functions-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-int-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-pointers-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-preamble-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-str-table-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-2.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-typedef-struct-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-union-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-variables-1.c > create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf.exp > create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-ctf-1.c > create mode 100644 include/ctf.h > > -- > 1.8.3.1 >
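Editorial sketch of the ctf_dtdef_hash question raised in the cover letter above: types are hashed on a stable identifier (the role TYPE_UID plays), while decls are hashed on the node's address (the role htab_hash_pointer plays). The structs and function names below are hypothetical stand-ins, not GCC's tree nodes or the code in ctfout.c, and the shift by 3 is only an assumption about discarding alignment bits.

```c
#include <stdint.h>

/* Hypothetical stand-ins for GCC tree nodes; not the real definitions.  */
struct toy_type { unsigned int uid; };
struct toy_decl { const char *name; };

/* Types: hash on a stable per-type identifier.  Two nodes carrying the
   same identifier land in the same bucket.  */
static unsigned int
toy_hash_type (const struct toy_type *t)
{
  return t->uid;
}

/* Decls: hash on the node's address, dropping low bits that are
   usually zero because of alignment.  */
static unsigned int
toy_hash_decl (const struct toy_decl *d)
{
  return (unsigned int) ((uintptr_t) d >> 3);
}
```

The practical difference is that an identifier-based hash depends only on the node's contents, whereas an address-based hash is only meaningful while the node stays at the same address.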
On 7/2/19 11:54 AM, Indu Bhagat wrote:
> Ping.
> Can someone please review these patches ? We would like to get the
> support for CTF integrated soon.

I'm not sure there's really even consensus that we want CTF support in GCC. Though I think that the changes you've made in the last several weeks do make it somewhat more palatable. But ultimately the first step is to get that consensus.

I'd hazard a guess that Jakub in particular isn't on board, as he's been pushing to some degree for post-processing or perhaps doing it via a plugin.

Richi has been guiding you a bit through how to make the changes easier to integrate, but I haven't seen him state one way or the other his preference on whether or not CTF support is something we want.

I'm hesitant to add CTF support in GCC, but can understand how it might be useful given the kernel's aversion to everything DWARF. But if the kernel is the primary consumer, then I'd lean towards post-processing.

Jeff
On Wed, Jul 3, 2019 at 5:18 AM Jeff Law <law@redhat.com> wrote:
>
> On 7/2/19 11:54 AM, Indu Bhagat wrote:
> > Ping.
> > Can someone please review these patches ? We would like to get the
> > support for CTF integrated soon.
> I'm not sure there's really even consensus that we want CTF support in
> GCC. Though I think that the changes you've made in the last several
> weeks do make it somewhat more palatable. But ultimately the first step
> is to get that consensus.
>
> I'd hazard a guess that Jakub in particular isn't on board as he's been
> pushing to some degree for post-processing or perhaps doing it via a
> plug in.
>
> Richi has been guiding you a bit through how to make the changes easier
> to integrate, but I haven't seen him state one way or the other his
> preference on whether or not CTF support is something we want.

I'm mostly worried about the lack of a specification and the apparent restriction to a subset of C (the patches have gcc_unreachable () in paths that can be reached by VECTOR_TYPE or COMPLEX_TYPE, not to mention FIXED_POINT_TYPE, etc.).

While CTF might be easy and fast to parse and small, I fear it will go the STABS way of being not extensible and bitrotten.

Given it appears to generate only debug info for symbols and no locations or whatnot, it should be sufficient to introspect the compilation to generate the CTF info on the side and then merge it in at link-time. Which makes me wonder if this shouldn't be a plugin for now until it is more complete and can be evaluated better (comments in the patches indicate even the on-disk format is in flux?). Adding plugin hook invocations to the three places the CTF info generation hooks off should be easy.

That said, the patch series isn't ready for integration since it will crash left and right -- did you bootstrap and run the testsuite with -gt?

Richard.

> I'm hesitant to add CTF support in GCC, but can understand how it might
> be useful given the kernel's aversion to everything dwarf. But if the
> kernel is the primary consumer then I'd lean towards post-processing.
>
> Jeff
On 07/02/2019 08:18 PM, Jeff Law wrote:
> On 7/2/19 11:54 AM, Indu Bhagat wrote:
>> Ping.
>> Can someone please review these patches ? We would like to get the
>> support for CTF integrated soon.
> I'm not sure there's really even consensus that we want CTF support in
> GCC. Though I think that the changes you've made in the last several
> weeks do make it somewhat more palatable. But ultimately the first step
> is to get that consensus.

Thanks for your message. Absolutely, consensus is the first step. We are happy to take all the constructive feedback and answer all the concerns to make certain that CTF support in the toolchain will be a useful and worthwhile contribution.

> I'd hazard a guess that Jakub in particular isn't on board as he's been
> pushing to some degree for post-processing or perhaps doing it via a
> plug in.
>
> Richi has been guiding you a bit through how to make the changes easier
> to integrate, but I haven't seen him state one way or the other his
> preference on whether or not CTF support is something we want.
>
> I'm hesitant to add CTF support in GCC, but can understand how it might
> be useful given the kernel's aversion to everything dwarf. But if the
> kernel is the primary consumer then I'd lean towards post-processing.

The kernel is just *one* of the consumers. There are other applications, external and internal to Oracle, that have shown interest. Not just that, a couple of distro and package maintainers have shown interest in enabling CTF by default.

Post-processing in the kernel and other internally available large applications has been a deterrent for adoption because of high space and compile-time costs.

I answered some of Jakub's concerns in this post:
https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00131.html

I would even argue that the use cases will only grow if CTF is properly supported in the toolchain.

Thanks
On 07/03/2019 05:31 AM, Richard Biener wrote: > On Wed, Jul 3, 2019 at 5:18 AM Jeff Law <law@redhat.com> wrote: >> On 7/2/19 11:54 AM, Indu Bhagat wrote: >>> Ping. >>> Can someone please review these patches ? We would like to get the >>> support for CTF integrated soon. >> I'm not sure there's really even consensus that we want CTF support in >> GCC. Though I think that the changes you've made in the last several >> weeks do make it somewhat more palatable. But ultimately the first step >> is to get that consensus. >> >> I'd hazard a guess that Jakub in particular isn't on board as he's been >> pushing to some degree for post-processing or perhaps doing it via a >> plug in. >> >> Richi has been guiding you a bit through how to make the changes easier >> to integrate, but I haven't seen him state one way or the other his >> preference on whether or not CTF support is something we want. > I'm mostly worried about the lack of a specification and the appearant > restriction on a subset of C (the patches have gcc_unreachable () > in paths that can be reached by VECTOR_TYPE or COMPLEX_TYPE > not to mention FIXED_POINT_TYPE, etc...). RE lack of specification : I cannot agree more; This does need to absolutely exist if we envision CTF support in toolchain to be useful to the community. We plan on getting to this task once the Linker changes are scoped and closer to done (~ a couple of weeks from now). Will this work ? RE subset of C : It is true that CTF format currently does leave out a very small subset of C like FIXED_POINT as you noted ( CTF does have representation for COMPLEX_TYPE, if my code paths culminate to gcc_unreachable () for that, I should fix them ). The end goal is to make it support all of C, and not just a subset. Meanwhile, I intend to make the compiler skip types when a C construct is not supported instead of crashing because of gcc_unreachable (). (You may have also noted stubs with "TBD WARN instead" notes in the patch series I sent.) 
> > While CTF might be easy and fast to parse and small I fear it will > go the STABS way of being not extensible and bitrotten. FWIW, I can understand this. We will maintain it. And I hope it will also be a community effort thereafter with active consumers, so there is a positive feedback loop. > > Given it appears to generate only debug info for symbols and no locations > or whatnot it should be sufficient to introspect the compilation to generate > the CTF info on the side and then merge it in at link-time. Which makes > me wonder if this shouldn't be a plugin for now until it is more complete > and can be evaluated better (comments in the patches indicate even the > on-disk format is in flux?). Adding plugin hook invocations to the three > places the CTF info generation hooks off should be easy. Yes, some bits of the on-disk format are being adapted to make it easier to adopt the CTF format across the board. E.g., we recently added CU name in the CTF header. As another example, we added CTF_K_SLICE type because there existed no way in CTF to represent enum bitfields. For the most part though, CTF format has stayed as is. Hmm...a GCC plugin for CTF generation at compile-time may work out for a single compilation unit. But I am not sure how will LTO be supported in that case. Basically, for LTO and -gtLEVEL to work together, I need the lto-wrapper to be aware of the presence of .ctf sections (so I think). I will need to combine the .ctf sections from multiple compilation units into a CTF archive, which the linker can then de-duplicate. Even if I assume that the technical hurdle in the above paragraph is solvable within the purview of a plugin, I fear worse problems of adoption, maintenance and distribution in the long run, if CTF support unfortunately ever remains to be done via a plugin for reasons unforeseen. Going the plugin route for the short term, will continue to suffer similar problems of distribution and support. 
- Is the plugin infrastructure supported on most platforms ? Also, I see that the plugin infrastructure supports all GCC versions from 4.5 onwards. Can someone confirm ? (We minimally want the toolchain support with GCC 4.8.5 and GCC 8 and later, for now.)
- How will the plugin be distributed for a variety of platforms and architectures outside of what Oracle Linux commits to support ?

Unless you are suggesting that the GCC plugin be distributed within GCC, meanwhile ? Well, that may be acceptable in the short term, depending on how I resolve some points raised above.

> That said, the patch series isn't ready for integration since it will
> crash left and right -- did you bootstrap and run the testsuite
> with -gt?

Bootstrap and Testsuite : Yes, I have. On x86_64/linux, sparc64/linux, aarch64/linux.
Run testsuite with -gt : Not yet. Believe me, it's on my plate. And I already regret not having done it sooner :)
Bootstrap with -gt : Not yet. I should try soon.

(I have compiled libdtrace-ctf with -gt and parsed the .ctf sections with the patch set.)

About the patch being not ready for integration : Yes, you're right. That's why I chose to retain 'RFC' for this patch series as well. I am working on issues, testing the compiler, and closing on the open ends in the implementation.

I will refresh the patch series when I have made a meaningful stride ahead. Any further suggestions on functional/performance testing will be helpful too.

Thanks again for your reviews.

Indu
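Editorial sketch of the "skip unsupported types instead of hitting gcc_unreachable ()" approach described in the message above. The kind enum and function names are hypothetical and stand in for GCC tree codes; which kinds count as unsupported here is an assumption made only for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical type kinds; stand-ins for GCC tree codes, not the real enum.  */
enum toy_kind
{
  TOY_INTEGER,
  TOY_POINTER,
  TOY_STRUCT,
  TOY_FIXED_POINT,
  TOY_VECTOR
};

/* Assumption for this sketch: fixed-point and vector types have no
   representation in the output format.  */
static bool
toy_kind_supported (enum toy_kind kind)
{
  switch (kind)
    {
    case TOY_INTEGER:
    case TOY_POINTER:
    case TOY_STRUCT:
      return true;
    default:
      return false;
    }
}

/* Emit what can be represented; warn about and skip the rest instead of
   aborting.  Returns the number of types emitted.  */
static int
toy_emit_types (const enum toy_kind *kinds, int count)
{
  int emitted = 0;
  for (int i = 0; i < count; i++)
    {
      if (!toy_kind_supported (kinds[i]))
        {
          fprintf (stderr, "warning: type kind %d skipped\n", (int) kinds[i]);
          continue;
        }
      emitted++;
    }
  return emitted;
}
```

The point of the design is that an unrepresentable construct degrades the debug output for that one type rather than aborting the whole compilation.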
On Thu, Jul 4, 2019 at 2:36 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: > > > On 07/03/2019 05:31 AM, Richard Biener wrote: > > On Wed, Jul 3, 2019 at 5:18 AM Jeff Law <law@redhat.com> wrote: > >> On 7/2/19 11:54 AM, Indu Bhagat wrote: > >>> Ping. > >>> Can someone please review these patches ? We would like to get the > >>> support for CTF integrated soon. > >> I'm not sure there's really even consensus that we want CTF support in > >> GCC. Though I think that the changes you've made in the last several > >> weeks do make it somewhat more palatable. But ultimately the first step > >> is to get that consensus. > >> > >> I'd hazard a guess that Jakub in particular isn't on board as he's been > >> pushing to some degree for post-processing or perhaps doing it via a > >> plug in. > >> > >> Richi has been guiding you a bit through how to make the changes easier > >> to integrate, but I haven't seen him state one way or the other his > >> preference on whether or not CTF support is something we want. > > I'm mostly worried about the lack of a specification and the appearant > > restriction on a subset of C (the patches have gcc_unreachable () > > in paths that can be reached by VECTOR_TYPE or COMPLEX_TYPE > > not to mention FIXED_POINT_TYPE, etc...). > > RE lack of specification : I cannot agree more; This does need to absolutely exist > if we envision CTF support in toolchain to be useful to the community. > We plan on getting to this task once the Linker changes are scoped and closer > to done (~ a couple of weeks from now). Will this work ? Sure - just keep in mind that it's difficult to give feedback to something without a specification. > RE subset of C : It is true that CTF format currently does leave out a very > small subset of C like FIXED_POINT as you noted ( CTF does have representation > for COMPLEX_TYPE, if my code paths culminate to gcc_unreachable () for that, I > should fix them ). The end goal is to make it support all of C, and not just a > subset. 
What about other languages? GCC supports C++, Ada, Objective-C, Go, D, Fortran, Modula-2, BRIG (this list is not necessarily complete and may change in the future). > Meanwhile, I intend to make the compiler skip types when a C construct is not > supported instead of crashing because of gcc_unreachable (). (You may have also > noted stubs with "TBD WARN instead" notes in the patch series I sent.) > > > > > > While CTF might be easy and fast to parse and small I fear it will > > go the STABS way of being not extensible and bitrotten. > > FWIW, I can understand this. We will maintain it. And I hope it will also be a > community effort thereafter with active consumers, so there is a positive > feedback loop. > > > > > Given it appears to generate only debug info for symbols and no locations > > or whatnot it should be sufficient to introspect the compilation to generate > > the CTF info on the side and then merge it in at link-time. Which makes > > me wonder if this shouldn't be a plugin for now until it is more complete > > and can be evaluated better (comments in the patches indicate even the > > on-disk format is in flux?). Adding plugin hook invocations to the three > > places the CTF info generation hooks off should be easy. > > Yes, some bits of the on-disk format are being adapted to make it easier to > adopt the CTF format across the board. E.g., we recently added CU name in the > CTF header. As another example, we added CTF_K_SLICE type because there existed > no way in CTF to represent enum bitfields. For the most part though, CTF format > has stayed as is. I hope the format is versioned at least. > Hmm...a GCC plugin for CTF generation at compile-time may work out for a single > compilation unit. But I am not sure how will LTO be supported in that case. > Basically, for LTO and -gtLEVEL to work together, I need the lto-wrapper to be > aware of the presence of .ctf sections (so I think). 
I will need to combine the > .ctf sections from multiple compilation units into a CTF archive, which the > linker can then de-duplicate. True. lto-wrapper does this kind of dancing for the much more complex set of DWARF sections already. > Even if I assume that the technical hurdle in the above paragraph is solvable > within the purview of a plugin, I fear worse problems of adoption, maintenance > and distribution in the long run, if CTF support unfortunately ever remains to be > done via a plugin for reasons unforeseen. > > Going the plugin route for the short term, will continue to suffer similar > problems of distribution and support. > > - Is the plugin infrastructure supported on most platforms ? Also, I see that > the plugin infrastructure supports all gcc versions from 4.5 onwards. > Can someone confirm ? ( We minimally want the toolchain support with > GCC 4.8.5 and GCC 8 and later, for now. ) The infrastructure is quite old but you'd need new invocation hooks so this won't help. > - How will the plugin be distributed for a variety of platforms and > architectures outside of what Oracle Linux commits to support ? > > Unless you are suggesting that the GCC plugin be distributed within GCC, > meanwhile ? Well, that may be acceptable in the short term, depending on how > I resolve some points raised above. > > > > > That said, the patch series isn't ready for integration since it will > > crash left and right -- did you bootstrap and run the testsuite > > with -gt? > > > > > Bootstrap and Testsuite : Yes, I have. On x86_64/linux, sparc64/linux, > aarch64/linux. > Run testsuite with -gt : Not yet. Believe me, it's on my plate. And I already > regret not having done it sooner :) > Bootstrap with -gt : Not yet. I should try soon. > > (I have compiled libdtrace-ctf with -gt and parsed the .ctf sections with the > patch set.) > > About the patch being not ready for integration : Yes, you're right. > That's why I chose to retain 'RFC' for this patch series as well. 
I am working > on issues, testing the compiler, and closing on the open ends in the > implementation. > > I will refresh the patch series when I have made a meaningful stride ahead. Any > further suggestions on functional/performance testing will be helpful too. What's the functional use of CTF? Print nice backtraces (without showing function argument values)? Richard. > Thanks again for your reviews. > > Indu >
On 07/04/2019 03:43 AM, Richard Biener wrote: > On Thu, Jul 4, 2019 at 2:36 AM Indu Bhagat<indu.bhagat@oracle.com> wrote: >> [...] >> RE subset of C : It is true that CTF format currently does leave out a very >> small subset of C like FIXED_POINT as you noted ( CTF does have representation >> for COMPLEX_TYPE, if my code paths culminate to gcc_unreachable () for that, I >> should fix them ). The end goal is to make it support all of C, and not just a >> subset. > What about other languages? GCC supports C++, Ada, Objective-C, Go, D, > Fortran, Modula-2, BRIG (this list is not necessarily complete and may change > in the future). The format supports C only at this time. Other languages are not on the radar yet. However, we have no intrinsic objection to them. Although, languages that already have fully-fledged type introspection and interpreted/ managed languages are probably out of scope, since they already have what CTF provides. > >> >>> Given it appears to generate only debug info for symbols and no locations >>> or whatnot it should be sufficient to introspect the compilation to generate >>> the CTF info on the side and then merge it in at link-time. Which makes >>> me wonder if this shouldn't be a plugin for now until it is more complete >>> and can be evaluated better (comments in the patches indicate even the >>> on-disk format is in flux?). Adding plugin hook invocations to the three >>> places the CTF info generation hooks off should be easy. >> Yes, some bits of the on-disk format are being adapted to make it easier to >> adopt the CTF format across the board. E.g., we recently added CU name in the >> CTF header. As another example, we added CTF_K_SLICE type because there existed >> no way in CTF to represent enum bitfields. For the most part though, CTF format >> has stayed as is. > I hope the format is versioned at least. Yes, the format is versioned. The current version is CTF_VERSION_3. 
All these format changes I talked about above are a part of CTF_VERSION_3. libctf handles backward compatibility for users of CTF in the toolchain; all transparently to the user. This means that, in future, when CTF version needs to be bumped, libctf will either support older version and/or transparently upgrade to the new version for further consumers. It also means that the compiler does not always need to change merely because the format has changed: (depending on the change) the linker can transparently adjust, as will all consumers if they try to read unlinked object files. > >>> That said, the patch series isn't ready for integration since it will >>> crash left and right -- did you bootstrap and run the testsuite >>> with -gt? >>> >>> >> Bootstrap and Testsuite : Yes, I have. On x86_64/linux, sparc64/linux, >> aarch64/linux. >> Run testsuite with -gt : Not yet. Believe me, it's on my plate. And I already >> regret not having done it sooner :) >> Bootstrap with -gt : Not yet. I should try soon. >> >> (I have compiled libdtrace-ctf with -gt and parsed the .ctf sections with the >> patch set.) >> >> About the patch being not ready for integration : Yes, you're right. >> That's why I chose to retain 'RFC' for this patch series as well. I am working >> on issues, testing the compiler, and closing on the open ends in the >> implementation. >> >> I will refresh the patch series when I have made a meaningful stride ahead. Any >> further suggestions on functional/performance testing will be helpful too. > What's the functional use of CTF? Print nice backtraces (without showing > function argument values)? > CTF, at this time, is type information for entities at global or file scope. This can be used by online debuggers, program tracers (dynamic tracing); More generally, it provides type introspection for C programs, with an optional library API to allow them to get at their own types quite more easily than DWARF. 
So, the umbrella use cases are: all C programs that want to introspect their own types quickly, and applications that want to introspect other programs' types quickly.

(Even with the exception of its embedded string table, it is already small enough to be kept around in stripped binaries so that it can be relied upon to be present.)

We are also extending the format so it is useful for other on-line debugging tools, such as backtracers.

Indu
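Editorial sketch of the version handling described in the message above, where a reader library accepts older format versions and upgrades them transparently. Everything here is hypothetical: the magic number, the version constant, the struct layout and the function name are placeholders and are not taken from ctf.h or libctf.

```c
#include <stdint.h>

/* Placeholder values for illustration only.  */
#define TOY_MAGIC 0x1234u
#define TOY_VERSION_CURRENT 3u

/* A minimal versioned preamble at the start of the section.  */
struct toy_preamble
{
  uint16_t magic;
  uint8_t version;
  uint8_t flags;
};

enum toy_open_result
{
  TOY_OPEN_OK,
  TOY_OPEN_UPGRADED,
  TOY_OPEN_BAD_MAGIC,
  TOY_OPEN_TOO_NEW
};

/* Accept the current version as-is, upgrade older versions in place,
   and reject data written by a newer producer or with a bad magic.  */
static enum toy_open_result
toy_open (struct toy_preamble *p)
{
  if (p->magic != TOY_MAGIC)
    return TOY_OPEN_BAD_MAGIC;
  if (p->version > TOY_VERSION_CURRENT)
    return TOY_OPEN_TOO_NEW;
  if (p->version < TOY_VERSION_CURRENT)
    {
      /* A real reader would also convert the body; only the version
         field is updated in this sketch.  */
      p->version = TOY_VERSION_CURRENT;
      return TOY_OPEN_UPGRADED;
    }
  return TOY_OPEN_OK;
}
```

With the check centralized in the reader, consumers see only the current version, which is what lets a producer lag behind a format change.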
On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: > > On 07/04/2019 03:43 AM, Richard Biener wrote: > > On Thu, Jul 4, 2019 at 2:36 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: > > [...] > > RE subset of C : It is true that CTF format currently does leave out a very > small subset of C like FIXED_POINT as you noted ( CTF does have representation > for COMPLEX_TYPE, if my code paths culminate to gcc_unreachable () for that, I > should fix them ). The end goal is to make it support all of C, and not just a > subset. > > What about other languages? GCC supports C++, Ada, Objective-C, Go, D, > Fortran, Modula-2, BRIG (this list is not necessarily complete and may change > in the future). > > The format supports C only at this time. Other languages are not on the radar > yet. However, we have no intrinsic objection to them. Although, languages > that already have fully-fledged type introspection and interpreted/ > managed languages are probably out of scope, since they already have > what CTF provides. > > > > Given it appears to generate only debug info for symbols and no locations > or whatnot it should be sufficient to introspect the compilation to generate > the CTF info on the side and then merge it in at link-time. Which makes > me wonder if this shouldn't be a plugin for now until it is more complete > and can be evaluated better (comments in the patches indicate even the > on-disk format is in flux?). Adding plugin hook invocations to the three > places the CTF info generation hooks off should be easy. > > Yes, some bits of the on-disk format are being adapted to make it easier to > adopt the CTF format across the board. E.g., we recently added CU name in the > CTF header. As another example, we added CTF_K_SLICE type because there existed > no way in CTF to represent enum bitfields. For the most part though, CTF format > has stayed as is. > > I hope the format is versioned at least. > > Yes, the format is versioned. 
The current version is CTF_VERSION_3. All these > format changes I talked about above are a part of CTF_VERSION_3. > > libctf handles backward compatibility for users of CTF in the toolchain; all > transparently to the user. This means that, in future, when CTF version needs > to be bumped, libctf will either support older version and/or transparently > upgrade to the new version for further consumers. > > It also means that the compiler does not always need to change merely because > the format has changed: (depending on the change) the linker can transparently > adjust, as will all consumers if they try to read unlinked object files. > > > That said, the patch series isn't ready for integration since it will > crash left and right -- did you bootstrap and run the testsuite > with -gt? > > > Bootstrap and Testsuite : Yes, I have. On x86_64/linux, sparc64/linux, > aarch64/linux. > Run testsuite with -gt : Not yet. Believe me, it's on my plate. And I already > regret not having done it sooner :) > Bootstrap with -gt : Not yet. I should try soon. > > (I have compiled libdtrace-ctf with -gt and parsed the .ctf sections with the > patch set.) > > About the patch being not ready for integration : Yes, you're right. > That's why I chose to retain 'RFC' for this patch series as well. I am working > on issues, testing the compiler, and closing on the open ends in the > implementation. > > I will refresh the patch series when I have made a meaningful stride ahead. Any > further suggestions on functional/performance testing will be helpful too. > > What's the functional use of CTF? Print nice backtraces (without showing > function argument values)? > > CTF, at this time, is type information for entities at global or file scope. > This can be used by online debuggers, program tracers (dynamic tracing); More > generally, it provides type introspection for C programs, with an optional > library API to allow them to get at their own types quite more easily than > DWARF. 
So, the umbrella usecases are - all C programs that want to introspect > their own types quickly; and applications that want to introspect other > programs's types quickly. What makes it superior to DWARF stripped down to the above feature set? > (Even with the exception of its embedded string table, it is already small > enough to be kept around in stripped binaries so that it can be relied upon > to be present.) So for distributing a program/library for SUSE we usually split the distribution into two pieces - the binaries and separated debug information. With CTF we'd then create both, continue stripping out the DWARF information but keep the CTF in the binaries? When a program contains CTF only, can gdb do anything to help debugging of a running program or a core file? Do you have gdb support in the works? > We are also extending the format so it is useful for other on-line debugging > tools, such as backtracers. So you become more complex similar to DWARF? Richard. > > Indu
On 5 Jul 2019, Richard Biener said: > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: >> CTF, at this time, is type information for entities at global or file scope. >> This can be used by online debuggers, program tracers (dynamic tracing); More >> generally, it provides type introspection for C programs, with an optional >> library API to allow them to get at their own types quite more easily than >> DWARF. So, the umbrella usecases are - all C programs that want to introspect >> their own types quickly; and applications that want to introspect other >> programs's types quickly. > > What makes it superior to DWARF stripped down to the above feature set? Increased compactness. DWARF fundamentally trades off compactness in favour of its regular structure, which makes it easier to parse (but not easier to interpret) but very hard to make it much smaller than it is now. Where DWARF uses word-sized and larger entities for everything, CTF packs everything much more tightly -- and this is quite visible in the resulting file sizes, once deduplicated. (CTF for the entire Linux kernel is about 6MiB after gzipping, and that includes not only complete descriptions of its tens of thousands of types but also type and string table entries for every structure and union member name, every enumeration member, and every global variable. More conventional programs will be able to eschew spending space on some of these because the ELF string table already contains their names, and we reuse those where possible. Insofar as it is possible to tell, the DWARF type info for the entire kernel, even after deduplication, would be many times larger: it is certainly much larger as it comes out of the compiler. You could define a "restricted DWARF" with smaller tags etc that is smaller, but frankly that would no longer be DWARF at all.) 
(I'm using the kernel as an example a lot not because CTF is kernel-specific but because our *existing deduplicator* happens to be targeted at the kernel. This is already an annoying limitation: we want to be able to use CTF in userspace more easily and more widely, without kludges and without incurring huge costs to generate gigabytes of DWARF we otherwise aren't using: hence this project.) When programs try to consume DWARF from large programs the size of the kernel, even with indexes I observe a multi-second lag and significant memory usage: no program I have tried has increased its RSS by less than 100MiB. CTF consumers can suck in the CTF for the core kernel in well under a third of a second, and can traverse the CTF for the kernel and all modules (multiple CTF sections in an archive, sharing common types with a parent section) in about a second and a half (from a cold cache): RSS goes up by about 15MiB. If DWARF usage can impose a burden that low on consumers, it's the first I've ever heard of it. >> (Even with the exception of its embedded string table, it is already small >> enough to be kept around in stripped binaries so that it can be relied upon >> to be present.) > > So for distributing a program/library for SUSE we usually split the > distribution into two pieces - the binaries and separated debug information. > With CTF we'd then create both, continue stripping out the DWARF information > but keep the CTF in the binaries? > > When a program contains CTF only, can gdb do anything to help debugging > of a running program or a core file? Do you have gdb support in the works? Yes, and it works well enough already to extract types from programs (going all the way from symbols to types requires some code on the GCC and linker side that is being written right now, and we can't test the GDB code that relies on that until then: equally, I'm still working on the linker so this is a demo on a randomly-chosen object file. 
This also means you don't see any benefits from strtab reuse with the containing ELF object, CTF section compression or deduplication in the following example's .ctf section size): [nix@ca-tools3 libiberty]$ /home/ibhagat/GCC/install/gcc-ctf/bin/gcc -c -DHAVE_CONFIG_H -gt -O2 -I. -I../../libiberty/../include -W -Wall -Wwrite-strings -Wc++-compat -Wstrict-prototypes -Wshadow=local -pedantic -D_GNU_SOURCE ../../libiberty/hashtab.c -o hashtab.o [nix@ca-tools3 libiberty]$ size -A hashtab.o hashtab.o : section size addr .text 4112 0 .data 16 0 .bss 0 0 .ctf 11907 0 .rodata.str1.8 40 0 .rodata.cst8 8 0 .rodata 480 0 .comment 43 0 .note.GNU-stack 0 0 Total 16606 [nix@ca-tools3 libiberty]$ ../gdb/gdb hashtab.o GNU gdb (GDB) 8.2.50.20190214-git Copyright (C) 2019 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "sparc64-unknown-linux-gnu". Type "show configuration" for configuration details. For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>. Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>. For help, type "help". Type "apropos word" to search for commands related to "word"... Reading symbols from hashtab.o... 
(gdb) info types All defined types: File /home/nix/binutils-gdb/foo/libiberty/hashtab.o: struct { unsigned char __arr[8]; }; typedef struct <unknown> FILE; typedef struct { long __pos; struct {...} __state; } _G_fpos64_t; typedef struct { long __pos; struct {...} __state; } _G_fpos_t; typedef short _G_int16_t; typedef int _G_int32_t; typedef unsigned short _G_uint16_t; typedef unsigned int _G_uint32_t; typedef struct <unknown> _IO_FILE; typedef struct { long (*read)(); long (*write)(); int (*seek)(); int (*close)(); } _IO_cookie_io_functions_t; typedef void _IO_lock_t; typedef struct <unknown> __FILE; typedef struct { unsigned char __arr[2]; } __STRING2_COPY_ARR2; typedef struct { unsigned char __arr[3]; } __STRING2_COPY_ARR3; typedef struct { unsigned char __arr[4]; } __STRING2_COPY_ARR4; typedef struct { unsigned char __arr[5]; } __STRING2_COPY_ARR5; typedef struct { unsigned char __arr[6]; } __STRING2_COPY_ARR6; typedef struct { unsigned char __arr[7]; } __STRING2_COPY_ARR7; typedef struct { unsigned char __arr[8]; } __STRING2_COPY_ARR8; typedef union { union wait *__uptr; int *__iptr; } __WAIT_STATUS; typedef long __blkcnt64_t; typedef long __blkcnt_t; typedef long __blksize_t; typedef char * __caddr_t; typedef long __clock_t; typedef int __clockid_t; typedef int (*)() __compar_d_fn_t; typedef int (*)() __compar_fn_t; typedef int __daddr_t; typedef unsigned long __dev_t; typedef long __fd_mask; typedef unsigned long __fsblkcnt64_t; typedef unsigned long __fsblkcnt_t; typedef unsigned long __fsfilcnt64_t; typedef unsigned long __fsfilcnt_t; typedef struct { int __val[2]; } __fsid_t; typedef unsigned int __gid_t; typedef void * __gnuc_va_list; typedef int __gwchar_t; typedef unsigned int __id_t; typedef unsigned long __ino64_t; [... and on and on...] 
gdb support, like everything other than GCC, uses the libctf library in the binutils-gdb repo, which will soon enough be made a public, versioned shared library so that other consumers can pitch in (I just don't want to do that before the linker changes are upstreamed).

>> We are also extending the format so it is useful for other on-line debugging
>> tools, such as backtracers.
>
> So you become more complex similar to DWARF?

Simplicity of types is not the goal. Compactness is the goal, and ease of parsing by end users once the format itself has been decoded (so nothing like the exprloc interpreter exists). We have simple data structures, sure, but they are not regular: rather they are tuned for the type system they are describing, and in some cases tuned further to maximize compactness for types that are more likely to be referenced often or occur frequently and types in the majority of non-huge programs (types used by many other types, etc).

As an example (a lengthy one, sorry!), types themselves have two overlapping core representations shared by all types, with a sentinel indicating which is in use for any given type:

typedef struct ctf_stype
{
  uint32_t ctt_name;     /* Reference to name in string table. */
  uint32_t ctt_info;     /* Encoded kind, variant length (see below). */
  union
  {
    uint32_t ctt_size;   /* Size of entire type in bytes. */
    uint32_t ctt_type;   /* Reference to another type. */
  };
} ctf_stype_t;

typedef struct ctf_type
{
  uint32_t ctt_name;     /* Reference to name in string table. */
  uint32_t ctt_info;     /* Encoded kind, variant length. */
  union
  {
    uint32_t ctt_size;   /* Always CTF_LSIZE_SENT. */
    uint32_t ctt_type;   /* Do not use. */
  };
  uint32_t ctt_lsizehi;  /* High 32 bits of type size in bytes. */
  uint32_t ctt_lsizelo;  /* Low 32 bits of type size in bytes. */
} ctf_type_t;

So the (very rare!) huge types pay the space in the type vector for a 64-bit type word (without requiring all users to have a uint64_t type): smaller types do not pay.
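To make the sentinel scheme concrete, here is a minimal sketch in C of how a reader might pick between the two representations. It is not libctf code: the `sketch_` names are hypothetical, the struct is a flat mirror of the quoted `ctf_type` (the union is collapsed into `ctt_size`), and the sentinel value is an assumption, since the thread names `CTF_LSIZE_SENT` without giving its value. It also assumes the type's kind is one for which `ctt_size` (rather than `ctt_type`) is meaningful.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: the thread names CTF_LSIZE_SENT but never gives its value;
   an all-ones 32-bit word is used here purely for illustration.  */
#define SKETCH_LSIZE_SENT UINT32_MAX

/* Hypothetical flat mirror of the large record quoted above.  The small
   record is its first three words.  */
typedef struct sketch_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  uint32_t ctt_size;     /* Shares storage with ctt_type in the original.  */
  uint32_t ctt_lsizehi;  /* Only present in the large form.  */
  uint32_t ctt_lsizelo;  /* Only present in the large form.  */
} sketch_type_t;

/* Return the type size in bytes and report how many bytes the fixed
   part of the record occupies (12 for the small form, 20 for the
   large one), so a reader knows where the fixed part ends.  */
uint64_t
sketch_type_size (const sketch_type_t *t, size_t *record_len)
{
  assert (t != NULL && record_len != NULL);

  if (t->ctt_size == SKETCH_LSIZE_SENT)
    {
      /* Large form: the real size lives in the two trailing words.  */
      *record_len = 5 * sizeof (uint32_t);
      return ((uint64_t) t->ctt_lsizehi << 32) | t->ctt_lsizelo;
    }

  /* Small form: the trailing two words must not be read.  */
  *record_len = 3 * sizeof (uint32_t);
  return t->ctt_size;
}
```

The point of the design is visible in the two return paths: only a type whose size does not fit in 32 bits takes the first branch and pays for the extra eight bytes.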
You might say types so huge are so rare that this adds nothing -- but a future format extension planned well before the end of this year will add *another* layer to this, giving us three core representations, and the third one is notably smaller:

typedef struct ctf_ttype
{
  uint32_t ctt_name;     /* Reference to name in string table. */
  uint16_t ctt_info;     /* Encoded kind, variant length. */
  union
  {
    uint16_t ctt_size;   /* Size of entire type in bytes. */
    uint16_t ctt_type;   /* Reference to another type. */
  };
} ctf_ttype_t;

(there is another sentinel hiding inside ctt_info to indicate when a type is represented using one of these).

The compiler will not need to adapt to any of this because libctf will transparently upgrade the older format into the newer one at link time. The compiler only needs to change if the format becomes more expressive -- e.g. when support for the GNU C types you mentioned is added.

This change will allow "smaller" programs (the majority of C programs) to encode types in only eight bytes per type plus similarly compact per-type variable-length data for things like structure members, down from twelve bytes now, and I can probably shrink it further, down to six bytes per type. Obviously not all types can be this compact: things like complex types fall back to the larger form, as do huge types and types that reference types with high IDs. But DWARF needs really quite a lot more, even for simple types, and there can be many thousands of them.

Structure and union members use similar trickery: as a result of all this, even now, our biggest space consumer is the strtab giving the names of the structure members!

The backtrace section, when it is designed, will follow a similar philosophy. Surprisingly, this sort of bit-shaving actually saves significant space even when the section is compressed: it seems Huffman dictionaries can't always elide small runs of high-byte zeroes...
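The "twelve bytes now, eight bytes later" figures can be checked directly from the three quoted layouts. The sketch below restates them under hypothetical `sketch_` names (the union gets a member name only so it reads the same on any C11 compiler) and asserts the sizes at compile time. It assumes a mainstream ABI where `uint32_t` has 4-byte alignment and no extra padding is inserted; the type count used in the usage note is an illustrative number, not a measurement from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the three representations quoted above.  */
typedef struct
{
  uint32_t name, info;
  union { uint32_t size, type; } u;
} sketch_stype_t;                      /* current small form */

typedef struct
{
  uint32_t name, info;
  union { uint32_t size, type; } u;
  uint32_t lsizehi, lsizelo;
} sketch_ltype_t;                      /* current large form */

typedef struct
{
  uint32_t name;
  uint16_t info;
  union { uint16_t size, type; } u;
} sketch_ttype_t;                      /* planned tiny form */

/* These hold on common ABIs; a failure here would mean the target pads
   the structs differently from what the discussion assumes.  */
static_assert (sizeof (sketch_stype_t) == 12, "small form is 12 bytes");
static_assert (sizeof (sketch_ltype_t) == 20, "large form is 20 bytes");
static_assert (sizeof (sketch_ttype_t) == 8, "tiny form is 8 bytes");

/* Bytes spent on fixed-size type headers alone, ignoring the
   variable-length data and string table that follow them.  */
uint64_t
sketch_header_bytes (uint64_t ntypes, size_t bytes_per_type)
{
  return ntypes * (uint64_t) bytes_per_type;
}
```

For an illustrative 10000 types, `sketch_header_bytes` gives 120000 bytes with the current small form and 80000 with the planned one, i.e. a third less before compression.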
On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: > > What makes it superior to DWARF stripped down to the above feature set? > > Increased compactness. DWARF fundamentally trades off compactness in > favour of its regular structure, which makes it easier to parse (but not > easier to interpret) but very hard to make it much smaller than it is > now. Where DWARF uses word-sized and larger entities for everything, CTF > packs everything much more tightly -- and this is quite visible in the That is just not true, most of the data in DWARF are uleb128/sleb128 encoded, or often even present just in the abbreviation table (DW_FORM_flag_present, DW_FORM_implicit_const), word-sized is typically only stuff that needs relocations (at assembly time and more importantly at link time). > could define a "restricted DWARF" with smaller tags etc that is smaller, > but frankly that would no longer be DWARF at all.) You can certainly decide what kind of information you emit and what you don't, it is done already (look at -g1 vs. -g2 vs. -g3), and can be done for other stuff, say if you aren't interested in any locations, DW_AT_{decl,call}_{file,line,column} can be omitted, if you aren't interested in debug info within functions, a lot of debug info can be emitted etc. And it still will be DWARF. For DWARF you also have various techniques at reducing size and optimizing redundancies away, look e.g. at the dwz utility. Jakub
On 5 Jul 2019, Jakub Jelinek outgrape: > On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: >> > What makes it superior to DWARF stripped down to the above feature set? >> >> Increased compactness. DWARF fundamentally trades off compactness in >> favour of its regular structure, which makes it easier to parse (but not >> easier to interpret) but very hard to make it much smaller than it is >> now. Where DWARF uses word-sized and larger entities for everything, CTF >> packs everything much more tightly -- and this is quite visible in the > > That is just not true, most of the data in DWARF are uleb128/sleb128 > encoded, or often even present just in the abbreviation table > (DW_FORM_flag_present, DW_FORM_implicit_const), word-sized is typically only > stuff that needs relocations (at assembly time and more importantly at link > time). Hm. I may have misread the spec. The fact remains that DWARF is (in typical usage) both large and slow to use: it is not entirely untrue to say that you can spot a DWARF consumer because it takes ten seconds to start up. This may be something that can be avoided with sufficiently clever implementations, but I've never seen any such implementation and we don't appear to be approaching one terribly fast :( meanwhile, in CTF we already have a working system that can reduce multigigabyte DWARF input down to 6MiB of compressed CTF loading in fractions of a second, though it is true that not all of that input was global-scope type info, so a large portion of that multigigabyte input would simply have been dropped and should not be considered relevant. I'm not sure how to determine how much of the input is type DIEs at global scope... (The 6MiB figure is slightly misleading, too, since only 1439845 bytes of that is type data: the rest is mostly compressed string table.) 
Possibly sufficiently clever deduplication can do a similar scrunching job for DWARF, but I note that what DWARF deduplication GCC did in earlier releases has subsequently been removed because it never really worked very well. (Having written code that deduplicates DWARF, I can see why: it's a complex job when you just have C to think about. Doing it for C++ as well must have made people's brains dribble out of their ears). Type signatures in DWARF 4 were supposed to provide this sort of thing, too, but yet again the promise does not seem to have been borne out: DWARF debuginfo remains immense and there is no discussion of leaving unstripped binaries on production systems for the sake of continuous tracing tools or introspection, because the debuginfo in those binaries would still be many times the size of the binaries they relate to, and obviously leaving it unstripped in that case is ridiculous. Meanwhile, FreeBSD has a leg-up in continuous debugging because they generate (an older form of) CTF for everything and deduplicate it, and it's small enough that they can leave it linked into the binaries rather than stripping it out, and tracers can and do use it. I'm trying to give us all that advantage, while not leaving us tied to a format with as many limitations as FreeBSD's CTF. As a side note, I tried switching to ULEB128 for the representations of unsigned integers in CTF a while back, but never even pushed it anywhere because while it shrank the output a little, the compressed sizes worsened noticeably, by about 10%, and we don't want to hurt the compressed sizes any more than we do the uncompressed ones. I found this quite annoying. So I'm not convinced that ULEB actually buys you much of anything once compressors get into the mix. 
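For readers unfamiliar with the encoding under discussion, here is a minimal sketch of a ULEB128 encoder in C (the function name is hypothetical; the algorithm is the standard one DWARF describes: seven payload bits per byte, low-order group first, high bit set on every byte except the last). It only illustrates why small values shrink and large ones grow; it says nothing about how the result compresses, which is the effect described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode VALUE as ULEB128 into OUT and return the number of bytes
   written.  OUT must have room for 10 bytes, the worst case for a
   64-bit value.  */
size_t
sketch_uleb128_encode (uint64_t value, uint8_t *out)
{
  size_t n = 0;

  assert (out != NULL);

  do
    {
      uint8_t byte = value & 0x7f;   /* Low seven bits of what is left.  */
      value >>= 7;
      if (value != 0)
        byte |= 0x80;                /* More bytes follow.  */
      out[n++] = byte;
    }
  while (value != 0);

  return n;
}
```

A value below 128 takes one byte instead of a fixed four, while a value that needs all 32 bits takes five, so the uncompressed saving depends entirely on how the values are distributed.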
Something similar happened when I tried to do clever things with string tables last month, sharing common string suffixes, slicing strtabs up on underscores and changes of case and replacing strings where beneficial with offset tables pointing into the sliced-up pieces: the uncompressed size shrank by about 50% and the compressed size grew by 20%... I found this *very* annoying. :) > For DWARF you also have various techniques at reducing size and optimizing > redundancies away, look e.g. at the dwz utility. ... interesting! I'll be looking through this and seeing if any of it is applicable to CTF as well, that's for sure.
On 07/04/2019 03:43 AM, Richard Biener wrote: >> Hmm...a GCC plugin for CTF generation at compile-time may work out for a single >> compilation unit. But I am not sure how will LTO be supported in that case. >> Basically, for LTO and -gtLEVEL to work together, I need the lto-wrapper to be >> aware of the presence of .ctf sections (so I think). I will need to combine the >> .ctf sections from multiple compilation units into a CTF archive, which the >> linker can then de-duplicate. > True. lto-wrapper does this kind of dancing for the much more complex set of > DWARF sections already. > >> Even if I assume that the technical hurdle in the above paragraph is solvable >> within the purview of a plugin, I fear worse problems of adoption, maintenance >> and distribution in the long run, if CTF support unfortunately ever remains to be >> done via a plugin for reasons unforeseen. >> >> Going the plugin route for the short term, will continue to suffer similar >> problems of distribution and support. >> >> - Is the plugin infrastructure supported on most platforms ? Also, I see that >> the plugin infrastructure supports all gcc versions from 4.5 onwards. >> Can someone confirm ? ( We minimally want the toolchain support with >> GCC 4.8.5 and GCC 8 and later, for now. ) > The infrastructure is quite old but you'd need new invocation hooks so this > won't help. > OK then. I will continue to focus on my current implementation without exploring the plugin option at this time. Thanks for confirming. Indu
On Jul 5, 2019, at 11:28 AM, Nix <nix@esperi.org.uk> wrote:
> (CTF for the entire Linux kernel is about 6MiB
Any reason why not add CTF to the next dwarf standard? Then, we just support the next dwarf standard. If not, have you started talks with them to add it?
Long term, this is a better solution, as we then get more interoperability, more support, more tools and more goodness.
To me this is the obvious solution to the problem.
On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: > On 5 Jul 2019, Richard Biener said: > > > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: > >> CTF, at this time, is type information for entities at global or file scope. > >> This can be used by online debuggers, program tracers (dynamic tracing); More > >> generally, it provides type introspection for C programs, with an optional > >> library API to allow them to get at their own types quite more easily than > >> DWARF. So, the umbrella usecases are - all C programs that want to introspect > >> their own types quickly; and applications that want to introspect other > >> programs's types quickly. > > > > What makes it superior to DWARF stripped down to the above feature set? > > Increased compactness. Does CTF support something like -fasynchronous-unwind-tables? You need that to have any sane debugging on many platforms. Without it, you even have only partial backtraces, on most architectures/ABIs anyway. Segher
On 7/9/19 5:25 PM, Segher Boessenkool wrote: > On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: >> On 5 Jul 2019, Richard Biener said: >> >>> On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: >>>> CTF, at this time, is type information for entities at global or file scope. >>>> This can be used by online debuggers, program tracers (dynamic tracing); More >>>> generally, it provides type introspection for C programs, with an optional >>>> library API to allow them to get at their own types quite more easily than >>>> DWARF. So, the umbrella usecases are - all C programs that want to introspect >>>> their own types quickly; and applications that want to introspect other >>>> programs's types quickly. >>> >>> What makes it superior to DWARF stripped down to the above feature set? >> >> Increased compactness. > > Does CTF support something like -fasynchronous-unwind-tables? You need > that to have any sane debugging on many platforms. Without it, you > even have only partial backtraces, on most architectures/ABIs anyway. I'd be surprised if it did since you need location information. FWIW, low level libraries like glibc depend on this stuff to support cancellation. jeff
[Sorry for delay: head down in linker plus having nice food poisoning bouts] On 9 Jul 2019, Mike Stump verbalised: > On Jul 5, 2019, at 11:28 AM, Nix <nix@esperi.org.uk> wrote: >> (CTF for the entire Linux kernel is about 6MiB > > Any reason why not add CTF to the next dwarf standard? Then, we just > support the next dwarf standard. If not, have you started talks with > them to add it? A mixture of impostor syndrome, the fact that CTF is really very non-DWARFish in all sorts of ways, and the fact that CTF-the-format is changing quite fast right now means that... well, if it is to be added, now is not the time. I haven't even documented it in texi yet :) (Just suggestions for improvement I've had on the binutils list will lead to a good few changes :) ). Right now, the rule for compatibility is that libctf will always be able to read all earlier versions written by any released binutils or libdtrace-ctf, and rewrite them as the latest version -- and one improvement I have planned is that it will eventually be able to *write* older versions as well, as long as doing so doesn't lose information or run into limitations of the older format (like trying to write >2^16 types to a format v1 container, or add an enum bitfield to a v2 container). I'm doing this in the obvious fashion: every time the format written by binutils libctf changes, it keeps the ability to upgrade all older CTF formats any release of binutils ever accepted to the latest format. Every binutils release after such a change constitutes a boundary: the next format change after that will bump the CTF format version, and the just-released format will be upgraded to be compatible with any new stuff that gets added. If CTF generation support lands in GCC, I'll treat compiler releases the same way, nailing the format any released GCC emits into binutils libctf at release time and ensuring binutils libctf can always accept it (and thus binutils ld can always link it and gdb can always use it). 
(I do not plan to ever drop support for any older CTF formats: indeed I plan to extend it so that the FreeBSD/Solaris CTF can be read as well, and hopefully eventually written too.) This should suffice to ensure that the CTF emitted by any released compiler and any released binutils can always be accepted by newer releases, and is probably the right approach until format evolution slows and we can start to actually standardize this. > Long term, this is a better solution, as we then get more > interoperability, more support, more tools and more goodness. Agreed! I do hope libctf remains flexible and useful enough that everyone can use it as a "format oracle", but I would welcome other implementations: the more the merrier! (It's just that now might be too early and too annoying for the other implementors, since the format is evolving faster than it ever has, thanks to all the lovely suggestions on the binutils list). If libctf *does* gain the ability to downgrade as well as upgrade formats, we can keep evolving the format even after standardization, with libctf translating the standardized version to newer versions and back down again as needed, restandardizing at intervals so the other tools can catch up: this seems like a fairly strong reason to gain the ability to write out old versions as well as new ones. (But I'm getting way ahead of myself here: the internal intermediate representation inside libctf that will make this sort of format ubiquity possible only exists inside my head right now, after all.)
On 10 Jul 2019, Segher Boessenkool spake thusly: > On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: >> On 5 Jul 2019, Richard Biener said: >> >> > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: >> >> CTF, at this time, is type information for entities at global or file scope. >> >> This can be used by online debuggers, program tracers (dynamic tracing); More >> >> generally, it provides type introspection for C programs, with an optional >> >> library API to allow them to get at their own types quite more easily than >> >> DWARF. So, the umbrella usecases are - all C programs that want to introspect >> >> their own types quickly; and applications that want to introspect other >> >> programs's types quickly. >> > >> > What makes it superior to DWARF stripped down to the above feature set? >> >> Increased compactness. > > Does CTF support something like -fasynchronous-unwind-tables? You need > that to have any sane debugging on many platforms. Without it, you > even have only partial backtraces, on most architectures/ABIs anyway. The backtrace section is still being designed, so it could! There is certainly nothing intrinsically preventing it. Am I right that this stuff works by ensuring that the arg->location picture is consistent at all times, between every instruction, rather than just at function calls, i.e. tracking all register moves done by the compiler, even transiently? Because that sounds doable, given that the compiler is doing the hard work of identifying such locations anyway (it has to for DWARF -fasynchronous-unwind-tables, right?). It seems essential to do this in any case if you want to get correct args for the function the user is actually stopped at: there's no requirement that the user is stopped at a function call!
On Thu, Jul 11, 2019 at 01:25:18PM +0100, Nix wrote: > On 10 Jul 2019, Segher Boessenkool spake thusly: > > > On Fri, Jul 05, 2019 at 07:28:12PM +0100, Nix wrote: > >> On 5 Jul 2019, Richard Biener said: > >> > >> > On Fri, Jul 5, 2019 at 12:21 AM Indu Bhagat <indu.bhagat@oracle.com> wrote: > >> >> CTF, at this time, is type information for entities at global or file scope. > >> >> This can be used by online debuggers, program tracers (dynamic tracing); More > >> >> generally, it provides type introspection for C programs, with an optional > >> >> library API to allow them to get at their own types quite more easily than > >> >> DWARF. So, the umbrella usecases are - all C programs that want to introspect > >> >> their own types quickly; and applications that want to introspect other > >> >> programs's types quickly. > >> > > >> > What makes it superior to DWARF stripped down to the above feature set? > >> > >> Increased compactness. > > > > Does CTF support something like -fasynchronous-unwind-tables? You need > > that to have any sane debugging on many platforms. Without it, you > > even have only partial backtraces, on most architectures/ABIs anyway. > > The backtrace section is still being designed, so it could! There is > certainly nothing intrinsically preventing it. Am I right that this > stuff works by ensuring that the arg->location picture is consistent at > all times, between every instruction, rather than just at function > calls, i.e. tracking all register moves done by the compiler, even > transiently? Yes, something like that. You get unwind tables that are valid at each instruction boundary. This is esp. important for the return address, without that backtraces are broken. > Because that sounds doable, given that the compiler is > doing the hard work of identifying such locations anyway (it has to for > DWARF -fasynchronous-unwind-tables, right?). Yes, every backend outputs DWARF info semi-manually for this. 
You have some work to do if you want to use this for CTF. > It seems essential to do this in any case if you want to get correct > args for the function the user is actually stopped at: there's no > requirement that the user is stopped at a function call! Yes. You need the asynchronous option only if you need this info at any possible point in a program -- but quite a few things do need it everywhere ;-) Segher