Message ID | yddlhknjc4e.fsf@lokon.CeBiTec.Uni-Bielefeld.DE |
---|---|
State | New |
On Jan 28, 2015, at 4:58 AM, Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> wrote:
>
> Thoughts?

So the timeout for slow things can be increased:

  # More time is needed
  set_board_info gcc,timeout 800
  set_board_info gdb,timeout 60

from the frv-sim.exp file.  And:

  # Increase the timeout
  set timeout 60

from a few others.  And:

  set_board_info testcase_timeout 30

from usparc-cygmon.exp.  I’d like to think one or more of those would cure the timeout problem.  You can put the timeout in a site.exp file or a $HOME/.dejagnurc file for dejagnu and try it out, something like:

  case "$target_triplet" in {
      { "sparc-*-*" } {
          set_board_info gdb,timeout 60
      }
  }

if you want to try site.exp.  The timeouts should be buried somewhere inside gcc or dejagnu for non-site specific timeouts however.

For individual test cases that are pigs that you catch, dg-timeout-factor can be used to slide it up.  On a good host, every test should take 10 seconds or less.  If they take more, I would bump the timeout to be 30x the time it takes on a good host.  This gives us headroom for shared machines, slower machines and all sorts of variations.  The slowest of the targets (m68k testing) aren’t expected to be able to be covered by this slop; they would need to use a more formal mechanism to say they are extra slow.

Upping the timeout of course won’t cure an infinite loop due to gdb bugs.

One thing that I wonder about is how much swap space one has and if the test suite is firing up too many gdb tasks too fast and simply running it out of memory or causing thrashing due to testcase size.  If you wanted to explore this, look for check_gcc_parallelize=10000 in the Makefile, and see if trimming it down helps.  If so, then Jakub might be able to help trim the guilty tests down so they don’t thrash as much.

> Ok for mainline?

Ok to me, but, if the gdb/guality folks have better comments, I’d defer to them.

The command line issue sounds like a gdb bug.  I’d report it and tag in a PR number into the comment where we create the command line.
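As a hedged illustration of the per-test mechanism mentioned above: a testcase can carry a dg-timeout-factor directive in a comment, which multiplies the normal timeout for that one test.  The file below is a made-up testcase, not one from the GCC tree; the factor 4.0, the options and the function name slow_sum are assumptions picked only for the example.

```c
/* Hypothetical slow testcase; not taken from the GCC testsuite.  */
/* { dg-do compile } */
/* { dg-options "-O2" } */
/* Multiply the normal timeout for this one test by 4 (illustrative value).  */
/* { dg-timeout-factor 4.0 } */

/* Placeholder for whatever makes the real test expensive.  */
unsigned long
slow_sum (unsigned long n)
{
  unsigned long total = 0;
  for (unsigned long i = 1; i <= n; i++)
    total += i;
  return total;
}
```

Compared with raising the board-wide timeout in site.exp, this keeps the extra allowance local to the one test that needs it.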
On 01/28/15 05:58, Rainer Orth wrote:
> Since the testsuite parallelism has been massively increased some time
> ago, I'm seeing lots of timeouts on slower SPARC hardware (1.2 GHz
> UltraSPARC-T2).  Closer investigation revealed that this happens on
> Solaris everywhere, though not so badly that the testsuite 300 second
> timeout hits.  The check_guality test in gcc.dg/guality and
> g++.dg/guality times out every time.
>
> It turns out that while the gfortran.dg/guality tests do run, the gcc.dg
> and g++.dg ones don't.  Running check_guality under truss shows that gdb
> complains
>
>   No symbol table is loaded.  Use the "file" command.
>
> and loops from there, unlike gfortran.dg where the command name is
> passed to gdb.  No idea why this doesn't happen on Linux, but the
> problem is easily cured by adding the command name (which is the actual
> executable gdb tries to attach to) to the gdb command line.
>
> This makes the guality tests run successfully on Solaris, even on the
> slow T2 box.
>
> However, the guality tests show dozens or even hundreds of FAILs and/or
> XPASSes, adding insane amounts of noise to the testsuite results, which
> nobody seems to be looking into.  So I wonder what the best course of
> action is here: one might consider running them only when a
> GCC_TEST_RUN_GUALITY environment variable is set, but skip them by
> default until someone actually interested in improving results here
> shows up.
>
> Thoughts?
>
> Anyway, here's the patch I've used.
>
> Ok for mainline?
>
> 	Rainer
>
>
> 2015-01-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>
> 	gcc/testsuite:
> 	* gcc.dg/guality/guality.h (main): Add argv[0] to
> 	guality_gdb_command.

OK.

As for what to do with guality, I haven't a clue.  They're dependent on the debugger version and perhaps other stuff that I don't recall.

Perhaps skip them if we find gdb and determine it is "too old"?

jeff
On Wed, Jan 28, 2015 at 01:42:47PM -0700, Jeff Law wrote:
> >2015-01-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
> >
> >	gcc/testsuite:
> >	* gcc.dg/guality/guality.h (main): Add argv[0] to
> >	guality_gdb_command.
> OK.
>
> As for what to do with guality, I haven't a clue.  They're dependent on the
> debugger version and perhaps other stuff that I don't recall.
>
> Perhaps skip them if we find gdb and determine it is "too old"?

We already do that.  I bet the Solaris case is more about the lack
of support to find an executable from its pid (/proc/<pid>/exe
in Linux).
Guess one can easily try it, run
  gdb
(without arguments, or those -nx -nw that guality uses) and type
  attach 19355 # pick pid of some process you can debug
in Linux gdb will find the binary etc.

	Jakub
Mike Stump <mikestump@comcast.net> writes:

> On Jan 28, 2015, at 4:58 AM, Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE> wrote:
>>
>> Thoughts?
>
> So the timeout for slow things can be increased:
>
>   # More time is needed
>   set_board_info gcc,timeout 800
>   set_board_info gdb,timeout 60
>
> from the frv-sim.exp file.  And:
>
>   # Increase the timeout
>   set timeout 60
>
> from a few others.  And:
>
>   set_board_info testcase_timeout 30
>
> from usparc-cygmon.exp.  I’d like to think one or more of those would cure the timeout problem.  You can put the timeout in a site.exp file or a $HOME/.dejagnurc file for dejagnu and try it out, something like:
>
>   case "$target_triplet" in {
>       { "sparc-*-*" } {
>           set_board_info gdb,timeout 60
>       }
>   }
>
> if you want to try site.exp.  The timeouts should be buried somewhere inside gcc or dejagnu for non-site specific timeouts however.

I know: I've been using something like this for a long time while still testing gcc on a 250 MHz MIPS R10k running (crawling, actually ;-) IRIX.  But that's not the point here: without passing the exec name to gdb, the check_guality test hangs forever, however large one might make the dejagnu timeout.

> For individual test cases that are pigs that you catch, dg-timeout-factor can be used to slide it up.  On a good host, every test should take 10 seconds or less.  If they take more, I would bump the timeout to be 30x the time it takes on a good host.  This gives us headroom for shared machines, slower machines and all sorts of variations.  The slowest of the targets (m68k testing) aren’t expected to be able to be covered by this slop; they would need to use a more formal mechanism to say they are extra slow.

I'll have to look into this some more when I'm forced to do more Solaris/SPARC testing on slow UltraSPARC-T2 systems: there are a couple of tests that regularly run into the dejagnu timeout, some already reported, others not yet.

> Upping the timeout of course won’t cure an infinite loop due to gdb bugs.

Indeed.

> One thing that I wonder about is how much swap space one has and if the test suite is firing up too many gdb tasks too fast and simply running it out of memory or causing thrashing due to testcase size.  If you wanted to explore this, look for check_gcc_parallelize=10000 in the Makefile, and see if trimming it down helps.  If so, then Jakub might be able to help trim the guilty tests down so they don’t thrash as much.

I don't think this is an issue: once I'd fixed guality.h to pass argv[0] to gdb, all guality tests completed just fine, even on my slowest SPARC machine.

>>> Ok for mainline?
>
> Ok to me, but, if the gdb/guality folks have better comments, I’d defer to them.

Indeed: this part of the question is more about the general quality and future of the guality tests.  They have been Alexandre's baby mostly, and he's been largely absent from GCC development lately.

> The command line issue sounds like a gdb bug.  I’d report it and tag in a PR number into the comment where we create the command line.

Depends: on some platforms, it might be possible to determine the exec name by OS-dependent means (be it /proc, getexecname, or whatever), but I bet there are targets supported by gdb that have no such facility, and for their sake (and currently available versions of gdb), just always passing argv[0] seems the easiest course of action.  As I said, gfortran.dg/guality/guality.exp already does it, and there were no issues even on Solaris.

	Rainer
On Wed, Jan 28, 2015 at 10:10:18PM +0100, Rainer Orth wrote:
> passing argv[0] seems the easiest course of action.  As I said,
> gfortran.dg/guality/guality.exp already does it, and there were no
> issues even on Solaris.

gfortran.dg/guality/guality.exp doesn't do that.
The thing is, there are 2 kinds of guality tests, the C/C++ ones using
guality.h, and then ones using gcc-gdb-test.exp, where the spawning of
gdb is done in tcl.
Those share just directories, nothing else, and gfortran.dg/guality
for obvious reasons doesn't contain any such tests.

	Jakub
Jakub Jelinek <jakub@redhat.com> writes:

> On Wed, Jan 28, 2015 at 10:10:18PM +0100, Rainer Orth wrote:
>> passing argv[0] seems the easiest course of action.  As I said,
>> gfortran.dg/guality/guality.exp already does it, and there were no
>> issues even on Solaris.
>
> gfortran.dg/guality/guality.exp doesn't do that.
> The thing is, there are 2 kinds of guality tests, the C/C++ ones using
> guality.h, and then ones using gcc-gdb-test.exp, where the spawning of
> gdb is done in tcl.
> Those share just directories, nothing else, and gfortran.dg/guality
> for obvious reasons doesn't contain any such tests.

You're right, of course: I looked at lib/gcc-gdb-test.exp (gdb-test), where gdb is called with the executable name, but there's no attach to an already running process involved, so it cannot be done any other way.

	Rainer
Jakub Jelinek <jakub@redhat.com> writes:

> On Wed, Jan 28, 2015 at 01:42:47PM -0700, Jeff Law wrote:
>> >2015-01-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>> >
>> >	gcc/testsuite:
>> >	* gcc.dg/guality/guality.h (main): Add argv[0] to
>> >	guality_gdb_command.
>> OK.
>>
>> As for what to do with guality, I haven't a clue.  They're dependent on the
>> debugger version and perhaps other stuff that I don't recall.
>>
>> Perhaps skip them if we find gdb and determine it is "too old"?
>
> We already do that.  I bet the Solaris case is more about the lack
> of support to find an executable from its pid (/proc/<pid>/exe
> in Linux).
> Guess one can easily try it, run
>   gdb
> (without arguments, or those -nx -nw that guality uses) and type
>   attach 19355 # pick pid of some process you can debug
> in Linux gdb will find the binary etc.

That issue is easily solved by passing the executable name to gdb; this is guaranteed to work everywhere.  On Solaris (at least from Solaris 10 onwards, haven't checked earlier versions), gdb could use /proc/<pid>/path/a.out to get at the executable, but that won't help for released versions of gdb (and possibly other platforms which provide no such facility).

But this issue is minor and easily avoided.  The major problem is that on both Solaris and Linux, many of the guality tests FAIL (or XPASS, equally adding noise to mail-reports.log) even with a current version of gdb (7.8 in my case):

                              Linux/x86_64   Solaris 11/x86   Solaris 11/SPARC
                              (Fedora 20)

gcc.dg/guality:

# of expected passes          6490           6500             5489
# of unexpected failures       191            171              802
# of unexpected successes       61             66               73
# of expected failures          35             30               23
# of unsupported tests         257            267              383

g++.dg/guality:

# of expected passes           128            128              118
# of unexpected failures         6             10               10
# of unsupported tests          34             30               40

It also seems (haven't checked yet in detail) that the results depend on whether they are created as part of a regular make check at the toplevel, compared to runtest --tool <tool> guality.exp.

Judging from posted testresults, I'm not the only one seeing this, and the guality tests have way more FAILs than all other tests combined: with those amounts of noise, it's almost impossible to see other errors, and nobody seems to work on fixing those.  Thus my suggestion not to run them by default until someone steps forward to take care of all those issues.

	Rainer
Jakub Jelinek <jakub@redhat.com> writes:

> On Wed, Jan 28, 2015 at 01:42:47PM -0700, Jeff Law wrote:
>> >2015-01-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>> >
>> >	gcc/testsuite:
>> >	* gcc.dg/guality/guality.h (main): Add argv[0] to
>> >	guality_gdb_command.
>> OK.
>>
>> As for what to do with guality, I haven't a clue.  They're dependent on the
>> debugger version and perhaps other stuff that I don't recall.
>>
>> Perhaps skip them if we find gdb and determine it is "too old"?
>
> We already do that.  I bet the Solaris case is more about the lack
> of support to find an executable from its pid (/proc/<pid>/exe
> in Linux).
> Guess one can easily try it, run
>   gdb
> (without arguments, or those -nx -nw that guality uses) and type
>   attach 19355 # pick pid of some process you can debug
> in Linux gdb will find the binary etc.

I've now filed GDB PR tdep/17903 for this, outlining a possible implementation of this feature on Solaris.

	Rainer
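For readers who want to see what "find an executable from its pid" amounts to, here is a minimal sketch in C.  It only reads the two /proc symlinks named in this thread (/proc/<pid>/exe on Linux, /proc/<pid>/path/a.out on Solaris 10 and later).  The helper name exe_path_for_pid is hypothetical, and this is not how gdb implements the lookup, nor what the PR proposes.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve the executable of PID into BUF (SIZE bytes) by reading a
   /proc symlink.  Returns the path length, or -1 if no candidate link
   could be read.  Hypothetical helper, not gdb code.  */
long
exe_path_for_pid (pid_t pid, char *buf, size_t size)
{
  /* Link names mentioned in the thread: Linux first, then Solaris.  */
  static const char *const suffixes[] = { "exe", "path/a.out" };
  char link[64];

  if (size == 0)
    return -1;

  for (size_t i = 0; i < sizeof suffixes / sizeof suffixes[0]; i++)
    {
      snprintf (link, sizeof link, "/proc/%ld/%s", (long) pid, suffixes[i]);
      ssize_t n = readlink (link, buf, size - 1);
      if (n > 0)
        {
          buf[n] = '\0';	/* readlink does not NUL-terminate.  */
          return (long) n;
        }
    }
  return -1;
}
```

On a system with neither link the function returns -1, which is the case where passing the executable name on the gdb command line is still needed.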
On 01/28/15 15:16, Rainer Orth wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
>
>> On Wed, Jan 28, 2015 at 01:42:47PM -0700, Jeff Law wrote:
>>>> 2015-01-28  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>
>>>>
>>>> 	gcc/testsuite:
>>>> 	* gcc.dg/guality/guality.h (main): Add argv[0] to
>>>> 	guality_gdb_command.
>>> OK.
>>>
>>> As for what to do with guality, I haven't a clue.  They're dependent on the
>>> debugger version and perhaps other stuff that I don't recall.
>>>
>>> Perhaps skip them if we find gdb and determine it is "too old"?
>>
>> We already do that.  I bet the Solaris case is more about the lack
>> of support to find an executable from its pid (/proc/<pid>/exe
>> in Linux).
>> Guess one can easily try it, run
>>   gdb
>> (without arguments, or those -nx -nw that guality uses) and type
>>   attach 19355 # pick pid of some process you can debug
>> in Linux gdb will find the binary etc.
>
> That issue is easily solved by passing the executable name to gdb; this
> is guaranteed to work everywhere.  On Solaris (at least from Solaris 10
> onwards, haven't checked earlier versions), gdb could use
> /proc/<pid>/path/a.out to get at the executable, but that won't help for
> released versions of gdb (and possibly other platforms which provide
> no such facility).
>
> But this issue is minor and easily avoided.  The major problem is that
> on both Solaris and Linux, many of the guality tests FAIL (or XPASS,
> equally adding noise to mail-reports.log) even with a current version of
> gdb (7.8 in my case):
>
>                               Linux/x86_64   Solaris 11/x86   Solaris 11/SPARC
>                               (Fedora 20)
>
> gcc.dg/guality:
>
> # of expected passes          6490           6500             5489
> # of unexpected failures       191            171              802
> # of unexpected successes       61             66               73
> # of expected failures          35             30               23
> # of unsupported tests         257            267              383
>
> g++.dg/guality:
>
> # of expected passes           128            128              118
> # of unexpected failures         6             10               10
> # of unsupported tests          34             30               40

Could we select a version of gdb as a reference version, then xfail those tests which don't work with that reference version on each platform to cut down the noise?  Obviously we bump the reference version, probably as we close stage1 or stage3 development?

It's not unusual to have different test results across x86 vs x86_64 vs sparc.  What I'd be real interested to know is if x86 across linux and solaris give the same results, similarly for x86-64 across those two platforms.

I realize it's a fair amount of work, but I'm always reluctant to simply disable a large number of tests.

Jeff
On Thu, Jan 29, 2015 at 11:11:12PM -0700, Jeff Law wrote:
> >                              Linux/x86_64   Solaris 11/x86   Solaris 11/SPARC
> >                              (Fedora 20)
> >
> >gcc.dg/guality:
> >
> ># of expected passes          6490           6500             5489
> ># of unexpected failures       191            171              802
> ># of unexpected successes       61             66               73
> ># of expected failures          35             30               23
> ># of unsupported tests         257            267              383
> >
> >g++.dg/guality:
> >
> ># of expected passes           128            128              118
> ># of unexpected failures         6             10               10
> ># of unsupported tests          34             30               40
> Could we select a version of gdb as a reference version, then xfail those
> tests which don't work with that reference version on each platform to cut
> down the noise?  Obviously we bump the reference version, probably as we
> close stage1 or stage3 development?

The biggest problem is that what fails and what does not varies between
targets and between optimization levels.  Right now we have no way to xfail
test XYZ for -Os on x86_64-linux and for -O2 and -O3 on i686-linux ia32, and
the lists would become very large.  Some tests in guality are xfailed just
in case, even when they actually XPASS on many targets.

The way to look for regressions in the guality area, at least as I do it
regularly, is just compare test_summary results.
If we'd disable this by default, I'm sure our debug quality would sink very
quickly.

	Jakub
Jakub Jelinek <jakub@redhat.com> writes:

> On Thu, Jan 29, 2015 at 11:11:12PM -0700, Jeff Law wrote:
>> >                              Linux/x86_64   Solaris 11/x86   Solaris 11/SPARC
>> >                              (Fedora 20)
>> >
>> >gcc.dg/guality:
>> >
>> ># of expected passes          6490           6500             5489
>> ># of unexpected failures       191            171              802
>> ># of unexpected successes       61             66               73
>> ># of expected failures          35             30               23
>> ># of unsupported tests         257            267              383
>> >
>> >g++.dg/guality:
>> >
>> ># of expected passes           128            128              118
>> ># of unexpected failures         6             10               10
>> ># of unsupported tests          34             30               40
>> Could we select a version of gdb as a reference version, then xfail those
>> tests which don't work with that reference version on each platform to cut
>> down the noise?  Obviously we bump the reference version, probably as we
>> close stage1 or stage3 development?
>
> The biggest problem is that what fails and what does not varies between
> targets and between optimization levels.  Right now we have no way to xfail
> test XYZ for -Os on x86_64-linux and for -O2 and -O3 on i686-linux ia32, and
> the lists would become very large.  Some tests in guality are xfailed just
> in case, even when they actually XPASS on many targets.

Right: while we could add such a facility (the current xfail selector support is already beyond what vanilla DejaGnu provides), it would quickly become very unwieldy.  The best we can do right now is to use dg-xfail-run-if more precisely for tests that fail to execute.  For gdb-test failures, the only option would be to dg-skip-if the affected tests per target and optimization level; quite large a hammer in some cases.  I've started looking at this, but it's incredibly tedious.

> The way to look for regressions in the guality area, at least as I do it
> regularly, is just compare test_summary results.
> If we'd disable this by default, I'm sure our debug quality would sink very
> quickly.

I'm not sure: my impression is that nobody really cared about the guality tests in quite some time.  Comparing test results might work reasonably, but looking over the output becomes infeasible when the vast majority of it is just guality failures.  I may end up running with GUALITY_GDB_NAME=/bin/true ;-(

	Rainer
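To make the two directives named above concrete, here is a hedged sketch of a testcase header that uses them.  Both take a comment, a target selector, a list of options that trigger the directive and a list of options that exclude it.  The targets, options and comment strings below are invented for illustration and do not describe real failures.

```c
/* Hypothetical testcase header; the targets, options and comments are
   illustrative only and do not describe real failures.  */
/* { dg-do run } */
/* { dg-options "-g" } */

/* Expect the run to fail on one target at -O2 and -O3 only.  */
/* { dg-xfail-run-if "fails to execute" { sparc*-*-solaris2* } { "-O2" "-O3" } { "" } } */

/* Skip the whole test on another target at -Os only.  */
/* { dg-skip-if "gdb-test failures" { i?86-*-linux* } { "-Os" } { "" } } */

int
main (void)
{
  return 0;
}
```

Each further target and optimization level needs another directive line like these in every affected test, which is where the tedium comes from.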
On 01/30/15 01:19, Jakub Jelinek wrote:
>
> The biggest problem is that what fails and what does not varies between
> targets and between optimization levels.  Right now we have no way to xfail
> test XYZ for -Os on x86_64-linux and for -O2 and -O3 on i686-linux ia32, and
> the lists would become very large.  Some tests in guality are xfailed just
> in case, even when they actually XPASS on many targets.

I thought we added that kind of capability a while back.  There's still significant potential for them to get unwieldy.  The hope would be that we'd have a set for x86, x86_64, aarch64, etc, but not have to do anything special for the OS.

>
> The way to look for regressions in the guality area, at least as I do it
> regularly, is just compare test_summary results.
> If we'd disable this by default, I'm sure our debug quality would sink very
> quickly.

Yup.  But it'd still be nicer if our test runs were cleaner.

jeff
Jeff Law <law@redhat.com> writes:

> On 01/30/15 01:19, Jakub Jelinek wrote:
>>
>> The biggest problem is that what fails and what does not varies between
>> targets and between optimization levels.  Right now we have no way to xfail
>> test XYZ for -Os on x86_64-linux and for -O2 and -O3 on i686-linux ia32, and
>> the lists would become very large.  Some tests in guality are xfailed just
>> in case, even when they actually XPASS on many targets.
> I thought we added that kind of capability a while back.  There's still
> significant potential for them to get unwieldy.  The hope would be that
> we'd have a set for x86, x86_64, aarch64, etc, but not have to do anything
> special for the OS.

I fear this won't suffice: it certainly will depend on the debug format used, and even so there are differences between Linux/x86 and Solaris/x86, both using ELF and DWARF (perhaps a DWARF-4 vs. DWARF-2 difference?).  And Darwin/x86 with Mach-O will certainly differ again (not currently noticeable since the guality tests are disabled there wholesale).

>> The way to look for regressions in the guality area, at least as I do it
>> regularly, is just compare test_summary results.
>> If we'd disable this by default, I'm sure our debug quality would sink very
>> quickly.
> Yup.  But it'd still be nicer if our test runs were cleaner.

Very true.  I wonder how best to go forward with filing PRs for the failures: one PR per failing test may be overkill, but it would require lots of analysis to group the failures by common cause.

	Rainer
# HG changeset patch
# Parent 26a271b29df9b72e6fbdb554a6a8b5e00d94e5ee
Skip guality tests on Solaris

diff --git a/gcc/testsuite/g++.dg/guality/guality.exp b/gcc/testsuite/g++.dg/guality/guality.exp
--- a/gcc/testsuite/g++.dg/guality/guality.exp
+++ b/gcc/testsuite/g++.dg/guality/guality.exp
@@ -5,7 +5,7 @@ load_lib gcc-gdb-test.exp
 
 # Disable on darwin until radr://7264615 is resolved.
 if { [istarget *-*-darwin*] } {
-    return
+    return
 }
 
 if { [istarget "powerpc-ibm-aix*"] } {
diff --git a/gcc/testsuite/gcc.dg/guality/guality.exp b/gcc/testsuite/gcc.dg/guality/guality.exp
--- a/gcc/testsuite/gcc.dg/guality/guality.exp
+++ b/gcc/testsuite/gcc.dg/guality/guality.exp
@@ -5,7 +5,7 @@ load_lib gcc-gdb-test.exp
 
 # Disable on darwin until radr://7264615 is resolved.
 if { [istarget *-*-darwin*] } {
-    return
+    return
 }
 
 if { [istarget "powerpc-ibm-aix*"] } {
diff --git a/gcc/testsuite/gcc.dg/guality/guality.h b/gcc/testsuite/gcc.dg/guality/guality.h
--- a/gcc/testsuite/gcc.dg/guality/guality.h
+++ b/gcc/testsuite/gcc.dg/guality/guality.h
@@ -228,6 +228,16 @@ main (int argc, char *argv[])
 	}
     }
 
+  if (argv[0])
+    {
+      int len = strlen (guality_gdb_command) + 1 + strlen (argv[0]);
+      char *buf = (char *) __builtin_alloca (len);
+      strcpy (buf, guality_gdb_command);
+      strcat (buf, " ");
+      strcat (buf, argv[0]);
+      guality_gdb_command = buf;
+    }
+
   for (i = 1; i < argc; i++)
     if (strcmp (argv[i], "--guality-skip") == 0)
       guality_skip = 1;
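The guality.h hunk appends the program name to the gdb command string.  Below is a standalone sketch of the same idea, written as a separate function so it can be tried outside the testsuite.  The name append_program_name is hypothetical and it uses malloc rather than __builtin_alloca.  Note the buffer size: strcpy and strcat also write a terminating NUL, so the sketch reserves one byte beyond the two string lengths and the separating space.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated "COMMAND PROGRAM" string, or NULL on
   allocation failure.  The caller frees the result.  Hypothetical
   helper; the patch itself uses __builtin_alloca inside main.  */
char *
append_program_name (const char *command, const char *program)
{
  size_t command_len = strlen (command);
  size_t program_len = strlen (program);
  /* One byte for the separating space, one for the terminating NUL.  */
  char *buf = malloc (command_len + 1 + program_len + 1);

  if (buf == NULL)
    return NULL;

  memcpy (buf, command, command_len);
  buf[command_len] = ' ';
  /* Copy PROGRAM together with its terminating NUL.  */
  memcpy (buf + command_len + 1, program, program_len + 1);
  return buf;
}
```

Called as append_program_name ("gdb -nx -nw", argv[0]), it yields a command line that names the executable gdb should load symbols from before attaching.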