Message ID | a1bd8b21-c935-125f-0b9c-381f0ed19b5a@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | Fix string/tst-xbzero-opt if build with gcc head. | expand |
On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote: > Fix string/tst-xbzero-opt is build with gcc head. ... > In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in > prepare_test_buffer. Thus count_test_patterns does not find any of the > test_pattern. > > This patch introduces a compiler barrier just after filling the buffer. I think I understand why the call to swapcontext in prepare_test_buffer is not a sufficient compiler barrier, but I am not a fan of asm volatile ("" ::: "memory"), because I fully expect some future compiler to decide that there are no actual assembly instructions being inserted so the statement can be completely ignored. I would prefer us to find some kind of construct that actually does make externally-visible side effects depend on the contents of 'buf' in terms of the C abstract machine. zw
On 07/12/2018 06:42 PM, Zack Weinberg wrote: > On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote: >> Fix string/tst-xbzero-opt is build with gcc head. > ... >> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in >> prepare_test_buffer. Thus count_test_patterns does not find any of the >> test_pattern. >> >> This patch introduces a compiler barrier just after filling the buffer. > > I think I understand why the call to swapcontext in > prepare_test_buffer is not a sufficient compiler barrier, but I am not > a fan of asm volatile ("" ::: "memory"), because I fully expect some > future compiler to decide that there are no actual assembly > instructions being inserted so the statement can be completely > ignored. I would prefer us to find some kind of construct that > actually does make externally-visible side effects depend on the > contents of 'buf' in terms of the C abstract machine. > > zw > Okay. Then here is a new version of the patch without the empty asm. If build with GCC head on s390x, setup_no_clear is now filling the buffer and setup_ordinary_clear is just a tail call to setup_no_clear (same behaviour as before). Is this C construct okay? Bye. Stefan commit a42ce495317df04f1abdc3670c1fcbb68f329c3d Author: Stefan Liebler <stli@linux.ibm.com> Date: Fri Jul 13 15:58:48 2018 +0200 Fix string/tst-xbzero-opt if build with gcc head. On s390x, the test string/tst-xbzero-opt is failing if build with gcc head: FAIL: no clear/prepare: expected 32 got 0 FAIL: no clear/test: expected some got 0 FAIL: ordinary clear/prepare: expected 32 got 0 INFO: ordinary clear/test: found 0 patterns (memset not eliminated) PASS: explicit clear/prepare: expected 32 got 32 PASS: explicit clear/test: expected 0 got 0 In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in prepare_test_buffer. Thus count_test_patterns does not find any of the test_pattern. This patch calls use_test_buffer in order to force the compiler to really copy the pattern to buf. ChangeLog: * string/tst-xbzero-opt.c (use_test_buffer): New function. (prepare_test_buffer): Call use_test_buffer as compiler barrier. diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c index cf7041f37a..b00fafd237 100644 --- a/string/tst-xbzero-opt.c +++ b/string/tst-xbzero-opt.c @@ -97,6 +97,17 @@ static const unsigned char test_pattern[16] = static ucontext_t uc_main, uc_co; +static __attribute__ ((noinline)) int +use_test_buffer (unsigned char *buf) +{ + unsigned int sum = 0; + + for (unsigned int i = 0; i < PATTERN_REPS; i++) + sum += buf[i * PATTERN_SIZE]; + + return (sum == 2 * PATTERN_REPS) ? 0 : 1; +} + /* Always check the test buffer immediately after filling it; this makes externally visible side effects depend on the buffer existing and having been filled in. */ @@ -108,6 +119,10 @@ prepare_test_buffer (unsigned char *buf) if (swapcontext (&uc_co, &uc_main)) abort (); + + /* Force the compiler to really copy the pattern to buf. */ + if (use_test_buffer (buf)) + abort (); } static void
On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com> wrote: > On 07/12/2018 06:42 PM, Zack Weinberg wrote: >> I would prefer us to find some kind of construct that >> actually does make externally-visible side effects depend on the >> contents of 'buf' in terms of the C abstract machine. >> > > Okay. Then here is a new version of the patch without the empty asm. > If build with GCC head on s390x, setup_no_clear is now filling the buffer > and setup_ordinary_clear is just a tail call to setup_no_clear (same > behaviour as before). Yes, this change is OK. zw
On 07/16/2018 02:36 PM, Zack Weinberg wrote: > On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com> wrote: >> On 07/12/2018 06:42 PM, Zack Weinberg wrote: >>> I would prefer us to find some kind of construct that >>> actually does make externally-visible side effects depend on the >>> contents of 'buf' in terms of the C abstract machine. >>> >> >> Okay. Then here is a new version of the patch without the empty asm. >> If build with GCC head on s390x, setup_no_clear is now filling the buffer >> and setup_ordinary_clear is just a tail call to setup_no_clear (same >> behaviour as before). > > Yes, this change is OK. > > zw > Okay. Thank you. Shall I commit this test-fix before or after the release? Bye. Stefan
On Mon, Jul 16, 2018 at 9:13 AM, Stefan Liebler <stli@linux.ibm.com> wrote: > On 07/16/2018 02:36 PM, Zack Weinberg wrote: >> On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com> >> wrote: >>> Okay. Then here is a new version of the patch without the empty asm. >>> If build with GCC head on s390x, setup_no_clear is now filling the buffer >>> and setup_ordinary_clear is just a tail call to setup_no_clear (same >>> behaviour as before). >> >> >> Yes, this change is OK. > > Okay. Thank you. > Shall I commit this test-fix before or after the release? Carlos has the last word but we usually accept bugfixes to test cases right up till the hard freeze. zw
On 07/16/2018 03:18 PM, Zack Weinberg wrote: > On Mon, Jul 16, 2018 at 9:13 AM, Stefan Liebler <stli@linux.ibm.com> wrote: >> On 07/16/2018 02:36 PM, Zack Weinberg wrote: >>> On Mon, Jul 16, 2018 at 7:05 AM, Stefan Liebler <stli@linux.ibm.com> >>> wrote: >>>> Okay. Then here is a new version of the patch without the empty asm. >>>> If build with GCC head on s390x, setup_no_clear is now filling the buffer >>>> and setup_ordinary_clear is just a tail call to setup_no_clear (same >>>> behaviour as before). >>> >>> >>> Yes, this change is OK. >> >> Okay. Thank you. >> Shall I commit this test-fix before or after the release? > > Carlos has the last word but we usually accept bugfixes to test cases > right up till the hard freeze. > > zw > Carlos, what about this bugfix for this test? Can I commit it now or shall I wait commit it after the release? Bye Stefan
On 07/16/2018 07:05 AM, Stefan Liebler wrote: > On 07/12/2018 06:42 PM, Zack Weinberg wrote: >> On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote: >>> Fix string/tst-xbzero-opt is build with gcc head. >> ... >>> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in >>> prepare_test_buffer. Thus count_test_patterns does not find any of the >>> test_pattern. >>> >>> This patch introduces a compiler barrier just after filling the buffer. >> >> I think I understand why the call to swapcontext in >> prepare_test_buffer is not a sufficient compiler barrier, but I am not >> a fan of asm volatile ("" ::: "memory"), because I fully expect some >> future compiler to decide that there are no actual assembly >> instructions being inserted so the statement can be completely >> ignored. I would prefer us to find some kind of construct that >> actually does make externally-visible side effects depend on the >> contents of 'buf' in terms of the C abstract machine. >> >> zw >> > > Okay. Then here is a new version of the patch without the empty asm. > If build with GCC head on s390x, setup_no_clear is now filling the > buffer and setup_ordinary_clear is just a tail call to setup_no_clear > (same behaviour as before). Thanks for fixing this. > commit a42ce495317df04f1abdc3670c1fcbb68f329c3d > Author: Stefan Liebler <stli@linux.ibm.com> > Date: Fri Jul 13 15:58:48 2018 +0200 > > Fix string/tst-xbzero-opt if build with gcc head. > > On s390x, the test string/tst-xbzero-opt is failing if build with gcc head: > FAIL: no clear/prepare: expected 32 got 0 > FAIL: no clear/test: expected some got 0 > FAIL: ordinary clear/prepare: expected 32 got 0 > INFO: ordinary clear/test: found 0 patterns (memset not eliminated) > PASS: explicit clear/prepare: expected 32 got 32 > PASS: explicit clear/test: expected 0 got 0 > > In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop > in prepare_test_buffer. Thus count_test_patterns does not find any of the > test_pattern. > > This patch calls use_test_buffer in order to force the compiler to really copy > the pattern to buf. OK. > > ChangeLog: > > * string/tst-xbzero-opt.c (use_test_buffer): New function. > (prepare_test_buffer): Call use_test_buffer as compiler barrier. OK for 2.28. Consider noclone also. Reviewed-by: Carlos O'Donell <carlos@redhat.com> > > diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c > index cf7041f37a..b00fafd237 100644 > --- a/string/tst-xbzero-opt.c > +++ b/string/tst-xbzero-opt.c > @@ -97,6 +97,17 @@ static const unsigned char test_pattern[16] = > > static ucontext_t uc_main, uc_co; > > +static __attribute__ ((noinline)) int Suggest: Add noclone. As optimization barriers we have also used "noclone" here e.g. malloc/tst-mallocstate.c (my_free). I can't see a case where a specialization of this function avoids the copy, but it might be possible? Just for safety's sake use noclone too. > +use_test_buffer (unsigned char *buf) > +{ > + unsigned int sum = 0; > + > + for (unsigned int i = 0; i < PATTERN_REPS; i++) > + sum += buf[i * PATTERN_SIZE]; > + > + return (sum == 2 * PATTERN_REPS) ? 0 : 1; > +} OK. > + > /* Always check the test buffer immediately after filling it; this > makes externally visible side effects depend on the buffer existing > and having been filled in. */ > @@ -108,6 +119,10 @@ prepare_test_buffer (unsigned char *buf) > > if (swapcontext (&uc_co, &uc_main)) > abort (); > + > + /* Force the compiler to really copy the pattern to buf. */ > + if (use_test_buffer (buf)) > + abort (); OK. > } > > static void
On 07/26/2018 03:33 PM, Carlos O'Donell wrote: > On 07/16/2018 07:05 AM, Stefan Liebler wrote: >> On 07/12/2018 06:42 PM, Zack Weinberg wrote: >>> On Thu, Jul 12, 2018 at 10:22 AM, Stefan Liebler <stli@linux.ibm.com> wrote: >>>> Fix string/tst-xbzero-opt is build with gcc head. >>> ... >>>> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in >>>> prepare_test_buffer. Thus count_test_patterns does not find any of the >>>> test_pattern. >>>> >>>> This patch introduces a compiler barrier just after filling the buffer. >>> >>> I think I understand why the call to swapcontext in >>> prepare_test_buffer is not a sufficient compiler barrier, but I am not >>> a fan of asm volatile ("" ::: "memory"), because I fully expect some >>> future compiler to decide that there are no actual assembly >>> instructions being inserted so the statement can be completely >>> ignored. I would prefer us to find some kind of construct that >>> actually does make externally-visible side effects depend on the >>> contents of 'buf' in terms of the C abstract machine. >>> >>> zw >>> >> >> Okay. Then here is a new version of the patch without the empty asm. >> If build with GCC head on s390x, setup_no_clear is now filling the >> buffer and setup_ordinary_clear is just a tail call to setup_no_clear >> (same behaviour as before). > > Thanks for fixing this. > >> commit a42ce495317df04f1abdc3670c1fcbb68f329c3d >> Author: Stefan Liebler <stli@linux.ibm.com> >> Date: Fri Jul 13 15:58:48 2018 +0200 >> >> Fix string/tst-xbzero-opt if build with gcc head. >> >> On s390x, the test string/tst-xbzero-opt is failing if build with gcc head: >> FAIL: no clear/prepare: expected 32 got 0 >> FAIL: no clear/test: expected some got 0 >> FAIL: ordinary clear/prepare: expected 32 got 0 >> INFO: ordinary clear/test: found 0 patterns (memset not eliminated) >> PASS: explicit clear/prepare: expected 32 got 32 >> PASS: explicit clear/test: expected 0 got 0 >> >> In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop >> in prepare_test_buffer. Thus count_test_patterns does not find any of the >> test_pattern. >> >> This patch calls use_test_buffer in order to force the compiler to really copy >> the pattern to buf. > > OK. > >> >> ChangeLog: >> >> * string/tst-xbzero-opt.c (use_test_buffer): New function. >> (prepare_test_buffer): Call use_test_buffer as compiler barrier. > > OK for 2.28. Consider noclone also. > > Reviewed-by: Carlos O'Donell <carlos@redhat.com> > Okay. I've just committed this patch with noclone. Thanks Stefan >> >> diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c >> index cf7041f37a..b00fafd237 100644 >> --- a/string/tst-xbzero-opt.c >> +++ b/string/tst-xbzero-opt.c >> @@ -97,6 +97,17 @@ static const unsigned char test_pattern[16] = >> >> static ucontext_t uc_main, uc_co; >> >> +static __attribute__ ((noinline)) int > > Suggest: Add noclone. > > As optimization barriers we have also used "noclone" here > e.g. malloc/tst-mallocstate.c (my_free). > > I can't see a case where a specialization of this function avoids > the copy, but it might be possible? Just for safety's sake use noclone > too. > >> +use_test_buffer (unsigned char *buf) >> +{ >> + unsigned int sum = 0; >> + >> + for (unsigned int i = 0; i < PATTERN_REPS; i++) >> + sum += buf[i * PATTERN_SIZE]; >> + >> + return (sum == 2 * PATTERN_REPS) ? 0 : 1; >> +} > > OK. > >> + >> /* Always check the test buffer immediately after filling it; this >> makes externally visible side effects depend on the buffer existing >> and having been filled in. */ >> @@ -108,6 +119,10 @@ prepare_test_buffer (unsigned char *buf) >> >> if (swapcontext (&uc_co, &uc_main)) >> abort (); >> + >> + /* Force the compiler to really copy the pattern to buf. */ >> + if (use_test_buffer (buf)) >> + abort (); > > OK. > >> } >> >> static void >
commit 65f9b078c5053faa93e1f572282463685a869864 Author: Stefan Liebler <stli@linux.ibm.com> Date: Thu Jul 12 16:07:26 2018 +0200 Fix string/tst-xbzero-opt if build with gcc head. On s390x, the test string/tst-xbzero-opt is failing if build with gcc head: FAIL: no clear/prepare: expected 32 got 0 FAIL: no clear/test: expected some got 0 FAIL: ordinary clear/prepare: expected 32 got 0 INFO: ordinary clear/test: found 0 patterns (memset not eliminated) PASS: explicit clear/prepare: expected 32 got 32 PASS: explicit clear/test: expected 0 got 0 In setup_no_clear / setup_ordinary_clear, GCC is omitting the memcpy loop in prepare_test_buffer. Thus count_test_patterns does not find any of the test_pattern. This patch introduces a compiler barrier just after filling the buffer and the filling of the test_pattern is not omitted. ChangeLog: * string/tst-xbzero-opt.c (prepare_test_buffer): Add compiler barrier. diff --git a/string/tst-xbzero-opt.c b/string/tst-xbzero-opt.c index cf7041f37a..4c2be0c197 100644 --- a/string/tst-xbzero-opt.c +++ b/string/tst-xbzero-opt.c @@ -106,6 +106,9 @@ prepare_test_buffer (unsigned char *buf) for (unsigned int i = 0; i < PATTERN_REPS; i++) memcpy (buf + i*PATTERN_SIZE, test_pattern, PATTERN_SIZE); + /* Force the compiler to really copy the pattern to buf. */ + __asm__ __volatile__ ("" ::: "memory"); + if (swapcontext (&uc_co, &uc_main)) abort (); }