[gomp] Move openacc vector& worker single handling to RTL

Message ID 87bnf9v5ma.fsf@kepler.schwinge.homeip.net
State New

Commit Message

Thomas Schwinge July 18, 2015, 3:37 p.m. UTC
Hi Nathan!

On Thu, 09 Jul 2015 20:25:22 -0400, Nathan Sidwell <nathan@acm.org> wrote:
> This is the patch I committed.  [...]

Prompted by your recent "-O0 patch" to »[f]ix PTX worker spill/fill«, I
used the attached patch 0001-O0-libgomp-C-C-testing.patch to run all C
and C++ libgomp testing with -O0 (for Fortran, we iterate through various
kinds of optimization levels anyway).  (There are no regressions of
OpenMP testing.)  

For OpenACC nvptx offloading, there must still be something wrong; here's
a count of the (non-deterministic!) regressions of ten runs of the
libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
probably makes sense to look into that one first.

For avoidance of doubt, there are no such regressions if I un-apply your
patch to »[m]ove openacc vector& worker single handling to RTL«.

libgomp.oacc-c:

    3: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    3: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-3.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    5: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-4.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    3: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-vector-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    2: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-vector-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    3: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    2: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-3.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    2: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-4.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    8: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-6.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-7.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    1: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/worker-partn-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    3: [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/worker-partn-6.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

libgomp.oacc-c++:

    5: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    5: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-3.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    5: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-4.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    6: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    3: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-vector-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    2: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-3.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-4.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    7: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    4: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-6.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    5: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-7.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
    1: [-PASS:-]{+FAIL:+} libgomp.oacc-c++/../libgomp.oacc-c-c++-common/worker-partn-6.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
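
The thread does not say how the per-test counts above were produced. As a minimal sketch, assuming each testsuite run's DejaGnu summary was saved to its own file (the function name `tally_fails` and the `run-N/libgomp.sum` paths are hypothetical), the following counts in how many runs each test was reported as FAIL. Unlike the list above, it counts every FAIL line, not only PASS-to-FAIL changes against a baseline run.

```shell
#!/bin/sh
# tally_fails FILE...
# Hypothetical helper: for the given DejaGnu .sum files (one per
# testsuite run), print how many files report each test as FAIL,
# most frequently failing test first.
tally_fails() {
    # Keep only the FAIL lines from all runs, group identical lines,
    # count each group, then sort by count in descending order.
    cat "$@" | grep '^FAIL: ' | sort | uniq -c | sort -rn
}

# Example invocation (paths are assumptions, not from the thread):
#   tally_fails run-1/libgomp.sum run-2/libgomp.sum run-3/libgomp.sum
```

With non-deterministic failures, the test at the top of this output is the one that fails in the most runs and so is the most convenient one to investigate first.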


Regards,
 Thomas

Comments

Nathan Sidwell July 20, 2015, 1:01 p.m. UTC | #1
On 07/18/15 11:37, Thomas Schwinge wrote:
> Hi Nathan!

> For OpenACC nvptx offloading, there must still be something wrong; here's
> a count of the (non-deterministic!) regressions of ten runs of the
> libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
> probably makes sense to look into that one first.

I'll take a look. :(

nathan
Nathan Sidwell July 20, 2015, 3:08 p.m. UTC | #2
On 07/20/15 09:01, Nathan Sidwell wrote:
> On 07/18/15 11:37, Thomas Schwinge wrote:
>> Hi Nathan!
>
>> For OpenACC nvptx offloading, there must still be something wrong; here's
>> a count of the (non-deterministic!) regressions of ten runs of the
>> libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
>> probably makes sense to look into that one first.
>
> I'll take a look. :(

Having difficulty reproducing it (preprocessed source compiled at -O0 works for 
me).  Do you have an exact recipe?


nathan
Nathan Sidwell July 21, 2015, 8:05 p.m. UTC | #3
On 07/18/15 11:37, Thomas Schwinge wrote:
> Hi Nathan!
>
> On Thu, 09 Jul 2015 20:25:22 -0400, Nathan Sidwell <nathan@acm.org> wrote:
>> This is the patch I committed.  [...]
>
> Prompted by your recent "-O0 patch" to »[f]ix PTX worker spill/fill«, I
> used the attached patch 0001-O0-libgomp-C-C-testing.patch to run all C
> and C++ libgomp testing with -O0 (for Fortran, we iterate through various
> kinds of optimization levels anyway).  (There are no regressions of
> OpenMP testing.)
>
> For OpenACC nvptx offloading, there must still be something wrong; here's
> a count of the (non-deterministic!) regressions of ten runs of the
> libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
> probably makes sense to look into that one first.
>
> For avoidance of doubt, there are no such regressions if I un-apply your
> patch to »[m]ove openacc vector& worker single handling to RTL«.

I cannot reproduce the failures.  Applying your patch I see the following new fails:

FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-5.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-3.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-7.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/present-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 output pattern test, is , should match present clause: !acc_is_present
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-vector-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-4.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test

Which differs from your list.  Attempting to reproduce outside the test suite 
results in working executables.

nathan
Thomas Schwinge July 22, 2015, 7:43 a.m. UTC | #4
Hi Nathan!

On Tue, 21 Jul 2015 16:05:05 -0400, Nathan Sidwell <nathan@codesourcery.com> wrote:
> On 07/18/15 11:37, Thomas Schwinge wrote:
> > On Thu, 09 Jul 2015 20:25:22 -0400, Nathan Sidwell <nathan@acm.org> wrote:
> >> This is the patch I committed.  [...]
> >
> > Prompted by your recent "-O0 patch" to »[f]ix PTX worker spill/fill«, I
> > used the attached patch 0001-O0-libgomp-C-C-testing.patch to run all C
> > and C++ libgomp testing with -O0 (for Fortran, we iterate through various
> > kinds of optimization levels anyway).  (There are no regressions of
> > OpenMP testing.)
> >
> > For OpenACC nvptx offloading, there must still be something wrong; here's
> > a count of the (non-deterministic!) regressions of ten runs of the
> > libgomp testsuite.  As private-vars-loop-worker-5.c fails most often, it
> > probably makes sense to look into that one first.
> >
> > For avoidance of doubt, there are no such regressions if I un-apply your
> > patch to »[m]ove openacc vector& worker single handling to RTL«.
> 
> I cannot reproduce the failures.  Applying your patch I see the following new fails:
> 
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/lib-5.c -DACC_DEVICE_TYPE_host_nonshm=1 -DACC_MEM_SHARED=0 execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-local-worker-3.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> FAIL: libgomp.oacc-c/../libgomp.oacc-c-c++-common/private-vars-loop-worker-7.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/present-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 output pattern test, is , should match present clause: !acc_is_present
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-local-worker-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-vector-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-4.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> FAIL: libgomp.oacc-c++/../libgomp.oacc-c-c++-common/private-vars-loop-worker-5.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 execution test
> 
> Which differs from your list.

Well, then instead look into one of these (the private-vars-* ones)?  :-)
(Still hoping they're all caused by the same problem.)

> Attempting to reproduce outside the test suite 
> results in working executables.

Have you tried running it multiple times?  As I said, it's
non-deterministic.

Taking from libgomp.log the compile command line of
private-vars-loop-worker-5.c for »-DACC_DEVICE_TYPE_nvidia=1«, removing
the constructor.o stuff, replacing »-L« by »{-L,-Wl\,-rpath\,}«, and
adding »-O0« at the end, I then see the following:

    $ while :; do ./private-vars-loop-worker-5.exe 2> /dev/null && echo -n .; done
    ...Aborted (core dumped)
    .........Aborted (core dumped)
    ........Aborted (core dumped)
    ....Aborted (core dumped)
    .Aborted (core dumped)
    ...........Aborted (core dumped)
    ........Aborted (core dumped)
    Aborted (core dumped)
    .Aborted (core dumped)
    ...Aborted (core dumped)
    [...]
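
Because the failure is intermittent, a single successful run proves little. As a rough sketch of how the loop above might be turned into a failure rate, the following POSIX shell function runs a command a fixed number of times and reports how many runs exited with a non-zero status (an abort counts as a failure). The function name `count_failures` and the run count are illustrative choices, not something used in this thread.

```shell
#!/bin/sh
# count_failures RUNS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND the given number of times and print
# how many of those runs exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the program's own output; only the exit status matters.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
}

# Example invocation (assumes the executable built as described above):
#   count_failures 100 ./private-vars-loop-worker-5.exe
```

A fixed run count makes results comparable between machines, which may help when one side sees failures and the other does not.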


Regards,
 Thomas

Patch

From a527ce3bcb60a4dbd8feb579dd90688b33760d78 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 17 Jul 2015 15:24:19 +0200
Subject: [PATCH] -O0 libgomp C, C++ testing

---
 libgomp/testsuite/libgomp.c++/c++.exp      | 1 +
 libgomp/testsuite/libgomp.c/c.exp          | 1 +
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 1 +
 libgomp/testsuite/libgomp.oacc-c/c.exp     | 1 +
 4 files changed, 4 insertions(+)

diff --git a/libgomp/testsuite/libgomp.c++/c++.exp b/libgomp/testsuite/libgomp.c++/c++.exp
index d6d525a..6bdb83d 100644
--- a/libgomp/testsuite/libgomp.c++/c++.exp
+++ b/libgomp/testsuite/libgomp.c++/c++.exp
@@ -16,6 +16,7 @@  if [info exists lang_include_flags] then {
 if ![info exists DEFAULT_CFLAGS] then {
     set DEFAULT_CFLAGS "-O2"
 }
+set DEFAULT_CFLAGS "-O0"
 
 # Initialize dg.
 dg-init
diff --git a/libgomp/testsuite/libgomp.c/c.exp b/libgomp/testsuite/libgomp.c/c.exp
index 25f347b..f89377f 100644
--- a/libgomp/testsuite/libgomp.c/c.exp
+++ b/libgomp/testsuite/libgomp.c/c.exp
@@ -16,6 +16,7 @@  load_gcc_lib gcc-dg.exp
 if ![info exists DEFAULT_CFLAGS] then {
     set DEFAULT_CFLAGS "-O2"
 }
+set DEFAULT_CFLAGS "-O0"
 
 # Initialize dg.
 dg-init
diff --git a/libgomp/testsuite/libgomp.oacc-c++/c++.exp b/libgomp/testsuite/libgomp.oacc-c++/c++.exp
index 7309f78..4dba472 100644
--- a/libgomp/testsuite/libgomp.oacc-c++/c++.exp
+++ b/libgomp/testsuite/libgomp.oacc-c++/c++.exp
@@ -18,6 +18,7 @@  if [info exists lang_include_flags] then {
 if ![info exists DEFAULT_CFLAGS] then {
     set DEFAULT_CFLAGS "-O2"
 }
+set DEFAULT_CFLAGS "-O0"
 
 # Initialize dg.
 dg-init
diff --git a/libgomp/testsuite/libgomp.oacc-c/c.exp b/libgomp/testsuite/libgomp.oacc-c/c.exp
index 60be15d..80b4635 100644
--- a/libgomp/testsuite/libgomp.oacc-c/c.exp
+++ b/libgomp/testsuite/libgomp.oacc-c/c.exp
@@ -18,6 +18,7 @@  load_gcc_lib gcc-dg.exp
 if ![info exists DEFAULT_CFLAGS] then {
     set DEFAULT_CFLAGS "-O2"
 }
+set DEFAULT_CFLAGS "-O0"
 
 # Initialize dg.
 dg-init
-- 
2.1.4