Message ID | 1553974e-f8ee-4b4c-9886-692d7328de87@baylibre.com |
---|---|
State | New |
Series | invoke.texi: Add note that -foffload= does not affect device detection |
On 3/1/24 08:23, Tobias Burnus wrote:
> Not very often, but I do keep running into issues (fails, segfaults)
> related to testing programs compiled with a GCC without offload
> configured and then using the system libraries. - That's equivalent
> to having the system compiler (or any offload compiler) and
> compiling with -foffload=disable.
>
> The problem is that while the program only contains host code,
> the run-time library still initializes devices when an API
> routine - such as omp_get_num_devices - is invoked. This can
> lead to odd bugs as target regions, obviously, will use host
> fallback (for any device number) but the API routines will
> happily operate on the actual devices, which can lead to odd
> errors.
>
> (The same issue arises when compiling for one offload target type
> and running on a system which has devices of another type.)
>
> I assume that that's not a very common problem, but it can be
> rather confusing when hitting this issue.
>
> Maybe the proposed wording will help others to avoid this pitfall.
> (Or is this superfluous as -foffload= is not much used and, even if,
> no one then remembers or finds this note?)
>
> Thoughts?

Well, I spent a long time looking at this, and my only conclusion is
that I don't really understand what the problem you're trying to solve
is. If it's problematical to have the runtime know about offload
devices the compiled code isn't using, don't users also need to know
how to restrict the runtime to a particular set of devices the same
way -foffload= lets you do, and not just how to disable offloading in
the runtime entirely?

It's pretty clearly documented already how -foffload affects the
compiler's behavior, and the library's behavior is already documented
in its own manual. Maybe what we don't have is a tutorial on how to
build/link/run programs using a specific offload device, or on the host?

Anyway, I don't really object to the text you want to add, but it makes
me more confused instead of less so.  :-S

> * * *
>
> It was not clear to me how to refer to libgomp.texi
> - Should it be 'libgomp' as in 'info libgomp' or the URL
>   https://gcc.gnu.org/onlinedocs/libgomp/ (or filename of the PDF)
>   implies?
> - Or as 'GNU Offloading and Multi Processing Runtime Library Manual'
>   as named linked to at https://gcc.gnu.org/onlinedocs or on the title
>   page of the PDF - but that name is not repeated in the info file or
>   the HTML file.
> - Or even 'GNU libgomp' to mirror a substring in the <title> of the HTML
>   file.
> I now ended up only implicitly referring that document.

The Texinfo input file has "@settitle GNU libgomp".

> Aside: Shouldn't all the HTML documents start with a <h1> and <title>
> before the table of content? Currently, it has:
>   <title>Top (GNU libgomp)</title>
> and the body starts with
>   <h2>Short Table of Contents</h2>

I think this is a bug in the version of texinfo used to produce the HTML
content for the GCC web site. Looking at a recent build of my own using
Texinfo 6.7, I do see

  <body lang="en">
  <h1 class="settitle" align="center">GNU libgomp</h1>

The manual on the web site says it was produced by "GNU Texinfo 7.0dev".

-Sandra
On 3/1/24 17:29, Sandra Loosemore wrote:
> On 3/1/24 08:23, Tobias Burnus wrote:
>> Aside: Shouldn't all the HTML documents start with a <h1> and <title>
>> before the table of content? Currently, it has:
>>   <title>Top (GNU libgomp)</title>
>> and the body starts with
>>   <h2>Short Table of Contents</h2>
>
> I think this is a bug in the version of texinfo used to produce the HTML
> content for the GCC web site. Looking at a recent build of my own using
> Texinfo 6.7, I do see
>
>   <body lang="en">
>   <h1 class="settitle" align="center">GNU libgomp</h1>
>
> The manual on the web site says it was produced by "GNU Texinfo 7.0dev".

I poked at this a little and apparently you need to fiddle with the
SHOW_TITLE or NO_TOP_NODE_OUTPUT customization variables in recent
versions of Texinfo in order to get the document title to show up in
HTML output.

https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#index-SHOW_005fTITLE

Probably this has to be controlled by a configure check since older
Texinfo versions may barf on unknown options.

I'm not at a good point to fiddle with this myself right now (I'm deep
inside more metadirective/declare variant hacking), also I have no idea
how to re-do the HTML manuals linked from the GCC web site to tweak the
formatting in this way. I'd think that if we were going to do that, we'd
also want to use an official release version of Texinfo instead of a
"dev" snapshot.

-Sandra
Hi,

Sandra Loosemore wrote:
> On 3/1/24 17:29, Sandra Loosemore wrote:
>> On 3/1/24 08:23, Tobias Burnus wrote:
>>> Aside: Shouldn't all the HTML documents start with a <h1> and
>>> <title> before the table of content? Currently, it has:
>>>   <title>Top (GNU libgomp)</title>
>>> and the body starts with
>>>   <h2>Short Table of Contents</h2>

I note that the 'Top(...)' in <title> already appears in the GCC 8.5 docs
(created with Texinfo 6.5; while GCC 7.5, created with Texinfo 6.3, is
okay). And the <h1> disappears in the GCC 10.5 doc, created with
Texinfo 7.0dev.

I have no idea why the 'Top(...)' appears with Texinfo 6.5, but the
missing <h1> is because of Texinfo 7.0, cf.
https://git.savannah.gnu.org/cgit/texinfo.git/plain/NEWS

I think it would be useful to remove the 'Top()' in <title> and add the
<h1> in general. For the GCC website, we might want to set
TOP_NODE_UP_URL.

>> I think this is a bug in the version of texinfo used to produce the
>> HTML content for the GCC web site. Looking at a recent build of my
>> own using Texinfo 6.7, I do see
>>
>>   <body lang="en">
>>   <h1 class="settitle" align="center">GNU libgomp</h1>
>>
>> The manual on the web site says it was produced by "GNU Texinfo 7.0dev".
>
> I poked at this a little and apparently you need to fiddle with the
> SHOW_TITLE or NO_TOP_NODE_OUTPUT customization variables in recent
> versions of Texinfo in order to get the document title to show up in
> HTML output.
>
> https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#index-SHOW_005fTITLE
>
> Probably this has to be controlled by a configure check since older
> Texinfo versions may barf on unknown options.
...
> I'd think that if we were going to do that, we'd also want to use an
> official release version of Texinfo instead of a "dev" snapshot.

(I concur that we should update 7.0dev to 7.0.3 or 7.1 on the server to
have a defined version.)

Thanks,

Tobias
Hi Sandra,

Sandra Loosemore wrote:
> On 3/1/24 08:23, Tobias Burnus wrote:
>> Maybe the proposed wording will help others to avoid this pitfall.
>> (Or is this superfluous as -foffload= is not much used and, even if,
>> no one then remembers or finds this note?)
>
> Well, I spent a long time looking at this, and my only conclusion is
> that I don't really understand what the problem you're trying to solve
> is. If it's problematical to have the runtime know about offload
> devices the compiled code isn't using, don't users also need to know
> how to restrict the runtime to a particular set of devices the same
> way -foffload= lets you do, and not just how to disable offloading in
> the runtime entirely?
> It's pretty clearly documented already how -foffload affects the
> compiler's behavior, and the library's behavior is already documented
> in its own manual. Maybe what we don't have is a tutorial on how to
> build/link/run programs using a specific offload device, or on the host?

The problem is for code like the following, which is perfectly valid and
works

(A) If you don't have any offload device (independent of the compiler
    options)
(B) If you have an offload device (supported by your libgomp) and
    compiled with offloading support (for that device)

But (C) if you have an offload device and compile as:

  gcc -fopenmp -foffload=disable

it will fail at runtime with:

  dev = 0 / num devs = 1
  Segmentation fault (core dumped)

The problem is that there is a mismatch between the code (assumes no
offload code + always host fallback) and the run-time library (which
detects offload devices), such that the API routines use a different
device than the 'target' code:

--------------------
#include <omp.h>
#include <stdio.h>

#define N 2064

int
main ()
{
  int *x = (int*) omp_target_alloc (sizeof(int)*N, omp_get_default_device ());
  printf ("dev = %d / num devs = %d\n",
          omp_get_default_device (), omp_get_num_devices ());
  #pragma omp target is_device_ptr(x)
  for (int i = 0; i < N; ++i)
    x[i] = i;
}
-------------------

On the technical side, it is not really surprising but it might still be
confusing for the user.

Obviously, it can also occur if you compile, e.g., for AMD GCN and only
an Nvidia device is available - but there the solution would be the same
(disable all devices). (OpenMP 6.0 will provide an environment variable
that allows fine tuning of the available devices.)

Questions:

* Is such a usage common enough to matter? I guess it makes sense for
  some benchmark use, to test whether real offloading or host fallback
  is faster; and if the latter is true, it might also get used in
  operational code.

* Are API routines used in such a code in a way that it breaks?
  (Unfortunately not very unlikely in larger code.)

If there is enough real-world usage (= 2x yes to the questions above):

* How to word it to help users and not to confuse them?

Tobias
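One way a program could protect itself against the mismatch described above is to ask a 'target' region where it really runs and to hand the matching device number to the API routines. The following is only a minimal sketch of that idea and is not part of the proposed patch. The helper names effective_device and fill_and_sum are hypothetical, and the stand-in functions in the #else branch exist only so that the sketch also builds without -fopenmp. It assumes that the run-time library hands out plain host memory when omp_target_alloc is called with the number returned by omp_get_initial_device.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Hypothetical stand-ins so the sketch also builds without -fopenmp;
   they model a system that only has the host.  */
static int omp_get_default_device (void) { return 0; }
static int omp_get_initial_device (void) { return 0; }
static int omp_is_initial_device (void) { return 1; }
static void *omp_target_alloc (size_t size, int dev)
{ (void) dev; return malloc (size); }
static void omp_target_free (void *ptr, int dev)
{ (void) dev; free (ptr); }
#endif

/* Hypothetical helper: run a tiny 'target' region to find out whether
   it executes on the host (host fallback) or on a device, and return
   the device number the API routines should use to stay consistent.  */
static int
effective_device (void)
{
  int on_host = 1;
  #pragma omp target map(from: on_host)
  on_host = omp_is_initial_device ();
  return on_host ? omp_get_initial_device () : omp_get_default_device ();
}

/* Hypothetical helper: store x[i] = i in memory allocated for the
   effective device and return the sum of the elements, or -1 if the
   allocation failed.  */
static long
fill_and_sum (int n)
{
  assert (n > 0);
  int dev = effective_device ();
  int *x = (int *) omp_target_alloc (sizeof (int) * (size_t) n, dev);
  if (x == NULL)
    return -1;

  long sum = 0;
  #pragma omp target device(dev) is_device_ptr(x) map(tofrom: sum)
  for (int i = 0; i < n; ++i)
    {
      x[i] = i;
      sum += x[i];
    }

  omp_target_free (x, dev);
  return sum;
}
```

Calling fill_and_sum (2064) mirrors the N used in the reproducer. The probe costs one extra 'target' region per call, so real code would probably cache its result. The alternative that needs no source change is the one the patch documents: setting OMP_TARGET_OFFLOAD to disabled.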
invoke.texi: Add note that -foffload= does not affect device detection

gcc/ChangeLog:

	* doc/invoke.texi (-foffload): Add note that the flag does not affect
	whether offload devices are detected.

 gcc/doc/invoke.texi | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index dc5fd863ca4..4153863020b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2736,38 +2736,45 @@ targets using ms-abi.
 
 @opindex foffload
 @cindex Offloading targets
 @cindex OpenACC offloading targets
 @cindex OpenMP offloading targets
 @item -foffload=disable
 @itemx -foffload=default
 @itemx -foffload=@var{target-list}
 Specify for which OpenMP and OpenACC offload targets code should be generated.
 The default behavior, equivalent to @option{-foffload=default}, is to generate
 code for all supported offload targets.  The @option{-foffload=disable} form
 generates code only for the host fallback, while
 @option{-foffload=@var{target-list}} generates code only for the specified
 comma-separated list of offload targets.
 
 Offload targets are specified in GCC's internal target-triplet format. You can
 run the compiler with @option{-v} to show the list of configured offload targets
 under @code{OFFLOAD_TARGET_NAMES}.
 
+Note that this option does not affect the available offload devices detected by
+the run-time library and, hence, the values returned by the OpenMP/OpenACC API
+routines or access to devices using those routines.  The run-time library
+itself can be tuned using environment variables; in particular, to fully disable
+the device detection, set the @code{OMP_TARGET_OFFLOAD} environment variable to
+@code{disabled}.
+
 @opindex foffload-options
 @cindex Offloading options
 @cindex OpenACC offloading options
 @cindex OpenMP offloading options
 @item -foffload-options=@var{options}
 @itemx -foffload-options=@var{target-triplet-list}=@var{options}
 
 With @option{-foffload-options=@var{options}}, GCC passes the specified
 @var{options} to the compilers for all enabled offloading targets.  You can
 specify options that apply only to a specific target or targets by using
 the @option{-foffload-options=@var{target-list}=@var{options}} form.  The
 @var{target-list} is a comma-separated list in the same format as for the
 @option{-foffload=} option.
 
 Typical command lines are
 
 @smallexample
 -foffload-options='-fno-math-errno -ffinite-math-only' -foffload-options=nvptx-none=-latomic
 -foffload-options=amdgcn-amdhsa=-march=gfx906