Message ID | 20230926143439.B589920431@pchp3.se.axis.com |
---|---|
State | New |
Headers | show |
Series | __atomic_test_and_set: Fall back to library, not non-atomic code | expand |
On 9/26/23 08:34, Hans-Peter Nilsson wrote: > Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi. > > For arm-eabi, notably lacking any atomic support for the > default multilib, with --target_board=arm-sim it regressed > 29_atomics/atomic_flag/cons/value_init.cc with the expected > linker failure due to lack of __atomic_test_and_set - which > is a good thing. With this one, there are 44 unexpected > FAILs for libstdc+++ at r14-4210-g94982a6b9cf4. This number > was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by > r14-3980-g62b29347c38394, deliberately. To fix the > regression, I'll do the same and follow up with adding > dg-require-thread-fence on > 29_atomics/atomic_flag/cons/value_init.cc (and if approved, > commit it before this one). > > Incidentally, the fortran test-results for arm-eabi are > riddled with missing-__sync_synchronize linker errors > causing some 18134 unexpected failures, where cris-elf has > 121. > > Ok to commit? > > -- >8 -- > Make __atomic_test_and_set consistent with other __atomic_ and __sync_ > builtins: call a matching library function instead of emitting > non-atomic code when the target has no direct insn support. > > There's special-case code handling targetm.atomic_test_and_set_trueval > != 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously, > if that worked but its matching emit_store_flag_force returned NULL, > we'd segfault later on. Now that the caller handles NULL, gcc_assert > here instead. > > While the referenced PR:s are ARM-specific, the issue is general. > > PR target/107567 > PR target/109166 > * builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>: > Handle failure from expand_builtin_atomic_test_and_set. > * optabs.cc (expand_atomic_test_and_set): When all attempts fail to > generate atomic code through target support, return NULL > instead of emitting non-atomic code. Also, for code handling > targetm.atomic_test_and_set_trueval != 1, gcc_assert result > from calling emit_store_flag_force instead of returning NULL. OK jeff
Hi! On Tue, 26 Sept 2023 at 16:34, Hans-Peter Nilsson <hp@axis.com> wrote: > Tested cris-elf, native x86_64-pc-linux-gnu and arm-eabi. > > For arm-eabi, notably lacking any atomic support for the > default multilib, with --target_board=arm-sim it regressed > 29_atomics/atomic_flag/cons/value_init.cc with the expected > linker failure due to lack of __atomic_test_and_set - which > is a good thing. With this one, there are 44 unexpected > FAILs for libstdc+++ at r14-4210-g94982a6b9cf4. This number > was 206 as late as r14-3470-g721f7e2c4e5e, but mitigated by > r14-3980-g62b29347c38394, deliberately. To fix the > regression, I'll do the same and follow up with adding > dg-require-thread-fence on > 29_atomics/atomic_flag/cons/value_init.cc (and if approved, > commit it before this one). > > Incidentally, the fortran test-results for arm-eabi are > riddled with missing-__sync_synchronize linker errors > causing some 18134 unexpected failures, where cris-elf has > 121. > > The patch passed almost all our CI configurations, except arm-eabi when testing with -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto where is causes these failures: FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess errors) UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation failed to produce executable FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for excess errors) UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 compilation failed to produce executable FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for excess errors) UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 compilation failed to produce executable FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test for excess errors) UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 compilation failed to produce executable FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test for excess errors) UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 compilation failed to produce executable The linker error is: undefined reference to `__atomic_test_and_set' Maybe we need a new variant of dg-require-thread-fence ? Thanks, Christophe Ok to commit? > > -- >8 -- > Make __atomic_test_and_set consistent with other __atomic_ and __sync_ > builtins: call a matching library function instead of emitting > non-atomic code when the target has no direct insn support. > > There's special-case code handling targetm.atomic_test_and_set_trueval > != 1 trying a modified maybe_emit_sync_lock_test_and_set. Previously, > if that worked but its matching emit_store_flag_force returned NULL, > we'd segfault later on. Now that the caller handles NULL, gcc_assert > here instead. > > While the referenced PR:s are ARM-specific, the issue is general. > > PR target/107567 > PR target/109166 > * builtins.cc (expand_builtin) <case BUILT_IN_ATOMIC_TEST_AND_SET>: > Handle failure from expand_builtin_atomic_test_and_set. > * optabs.cc (expand_atomic_test_and_set): When all attempts fail to > generate atomic code through target support, return NULL > instead of emitting non-atomic code. Also, for code handling > targetm.atomic_test_and_set_trueval != 1, gcc_assert result > from calling emit_store_flag_force instead of returning NULL. > --- > gcc/builtins.cc | 5 ++++- > gcc/optabs.cc | 22 +++++++--------------- > 2 files changed, 11 insertions(+), 16 deletions(-) > > diff --git a/gcc/builtins.cc b/gcc/builtins.cc > index 6e4274bb2a4e..40dfd36a3197 100644 > --- a/gcc/builtins.cc > +++ b/gcc/builtins.cc > @@ -8387,7 +8387,10 @@ expand_builtin (tree exp, rtx target, rtx > subtarget, machine_mode mode, > break; > > case BUILT_IN_ATOMIC_TEST_AND_SET: > - return expand_builtin_atomic_test_and_set (exp, target); > + target = expand_builtin_atomic_test_and_set (exp, target); > + if (target) > + return target; > + break; > > case BUILT_IN_ATOMIC_CLEAR: > return expand_builtin_atomic_clear (exp); > diff --git a/gcc/optabs.cc b/gcc/optabs.cc > index 8b96f23aec05..e1898da22808 100644 > --- a/gcc/optabs.cc > +++ b/gcc/optabs.cc > @@ -7080,25 +7080,17 @@ expand_atomic_test_and_set (rtx target, rtx mem, > enum memmodel model) > /* Recall that the legacy lock_test_and_set optab was allowed to do > magic > things with the value 1. Thus we try again without trueval. */ > if (!ret && targetm.atomic_test_and_set_trueval != 1) > - ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, > model); > - > - /* Failing all else, assume a single threaded environment and simply > - perform the operation. */ > - if (!ret) > { > - /* If the result is ignored skip the move to target. */ > - if (subtarget != const0_rtx) > - emit_move_insn (subtarget, mem); > + ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, > const1_rtx, model); > > - emit_move_insn (mem, trueval); > - ret = subtarget; > + if (ret) > + { > + /* Rectify the not-one trueval. */ > + ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, > 0, 1); > + gcc_assert (ret); > + } > } > > - /* Recall that have to return a boolean value; rectify if trueval > - is not exactly one. */ > - if (targetm.atomic_test_and_set_trueval != 1) > - ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1); > - > return ret; > } > > -- > 2.30.2 > >
> From: Christophe Lyon <christophe.lyon@linaro.org> > Date: Tue, 3 Oct 2023 15:20:39 +0200 > The patch passed almost all our CI configurations, except arm-eabi when > testing with > -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto > where is causes these failures: > FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess > errors) > UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 compilation > failed to produce executable > FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 (test for > excess errors) > UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++20 > compilation failed to produce executable > FAIL: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 (test for > excess errors) > UNRESOLVED: 29_atomics/atomic_flag/cons/value_init.cc -std=gnu++26 > compilation failed to produce executable > FAIL: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 (test > for excess errors) > UNRESOLVED: 29_atomics/atomic_flag/test_and_set/explicit.cc -std=gnu++17 > compilation failed to produce executable > FAIL: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 (test > for excess errors) > UNRESOLVED: 29_atomics/atomic_flag/test_and_set/implicit.cc -std=gnu++17 > compilation failed to produce executable For which set of multilibs in that set, do you get these errors? I'm guessing -march=armv6s-m, but I'm checking. > The linker error is: > undefined reference to `__atomic_test_and_set' I read that as you're saying you have a multilib combination where you currently don't emit __sync_synchronize but also don't emit anything for __atomic_test_and_set. > Maybe we need a new variant of dg-require-thread-fence ? Perhaps. Unless of course, there's a multilib combination for which you *can* emit the proper atomic spell; missing it because the need for it, has been hidden! (At first I thought it was related to caching the thread-fence property across multilib testing, but I don't think that was correct.) > > Thanks, > > Christophe > > > Ok to commit? ENOPATCH brgds, H-P
On Wed, 4 Oct 2023 at 02:49, Hans-Peter Nilsson <hp@axis.com> wrote: > (Just before sending, I noticed you replied off-list; I > won't add back gcc-patches to cc here myself, but please > feel free to do it, if you choose to reply.) > Sorry, it was a typo of mine, I meant to reply to the list > > > From: Christophe Lyon <christophe.lyon@linaro.org> > > Date: Tue, 3 Oct 2023 18:06:21 +0200 > > > On Tue, 3 Oct 2023 at 17:16, Hans-Peter Nilsson <hp@axis.com> wrote: > > > > > > From: Christophe Lyon <christophe.lyon@linaro.org> > > > > Date: Tue, 3 Oct 2023 15:20:39 +0200 > > > > > > > The patch passed almost all our CI configurations, except arm-eabi > when > > > > testing with > > > > -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto > > > > where is causes these failures: > > > > FAIL: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 (test for excess > > > > errors) > > > > UNRESOLVED: 29_atomics/atomic_flag/clear/1.cc -std=gnu++17 > compilation > > > > failed to produce executable > [...] > > > For which set of multilibs in that set, do you get these > > > errors? I'm guessing -march=armv6s-m, but I'm checking. > > > > > > > Not sure to understand the question? > > By your "testing with > -mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto" > I presume you mean "testing with make check > > 'RUNTESTFLAGS=--target_board=arm-sim/-mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto' > (as that's where you usually put that expression) > Yes > > - but I misremembered what "/" means in RUNTESTFLAGS (it's > combining options in one single set of options passed to > gcc, not delimiting separate options used to form > combinations). Sorry for the confusion! > I see. Actually I always find "multilib" confusing in the context of running the tests, where in fact we generally mean we add some flags in RUNTESTFLAGS and/or --target_board. To me, multilib refers to the set of lib variants we build, but I noticed "multilib" is often used in the context of running the tests... > > > > > Maybe we need a new variant of dg-require-thread-fence ? > > > > > > Perhaps. > > Or rather: most certainly. IIUC, ARMv6 (or whatever you > prefer to call it) can load and store atomically, but only > as separate events; it can't atomically exchange a stored > value and therefore arm-eabi calls out to a library > function. > In this case, it's armv6s-m (which is different from armv6....) And yes, it seems so, as your patch showed, assuming there's no bug in the target description ;-) > I think I'll help and replace the obvious uses of > dg-require-thread-fence where actually an atomic exchange is > required; replacing those with a new directive > dg-require-atomic-exchange. That will however not be *all* > places where such a guard should be added. > Indeed. > I also see lots of undefined references to *other* outlined > atomic builtins, for example __atomic_compare_exchange_4 and > __atomic_fetch_add_4. Though, those likely aren't > regressions. I understand you focus on regressions here. > Yes, my reply to your patch was meant to look at the regressions. As a separate action, I plan to look at the remaining existing such failures. > By the way, how do you test; what simulator, what baseboard > file? Please share! Also, please send *some* > contrib/test_summary reports for arm-eabi to > gcc-testresults@ every now and then. (But also, please > We use qemu, with qemu.exp from: https://git.linaro.org/toolchain/abe.git/tree/config/boards/qemu.exp nothing fancy ;-) > don't post multiple results several times a day for similar > configurations. Looking at you, powerpc people!) > We have plans to restart sending such results, like I was doing several years ago (with many results every day, too ;-)) > I can't test *anything* newer than default arm-eabi (armv4t) > on arm-sim (the one next to gdb), or else execution tests > get lost and time out while also complaining about "Unknown > machine type". I noticed there's a qemu.exp in ToT dejagnu, > but it doesn't work for arm-eabi for at least two reasons. > (I might get to that yak later, I just take it as a sign > that qemu-arm isn't what I look for.) We do use qemu-arm, depending on how you want to test, maybe you need to add a -cpu flag such that it supports the required instructions. > > Unless of course, there's a multilib combination > > > for which you *can* emit the proper atomic spell; missing it > > > because the need for it, has been hidden! > > > > > > (At first I thought it was related to caching the > > > thread-fence property across multilib testing, but I don't > > > think that was correct.) > > > > > Not sure what you mean? We run the tests for a single multilib here > > (mthumb/-march=armv6s-m/-mtune=cortex-m0/-mfloat-abi=soft/-mfpu=auto) > > so the cached value should always be correct? > > Yeah, part of my RUNTESTFLAGS confusion per above, please > ignore that. > > brgds, H-P >
diff --git a/gcc/builtins.cc b/gcc/builtins.cc index 6e4274bb2a4e..40dfd36a3197 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -8387,7 +8387,10 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, break; case BUILT_IN_ATOMIC_TEST_AND_SET: - return expand_builtin_atomic_test_and_set (exp, target); + target = expand_builtin_atomic_test_and_set (exp, target); + if (target) + return target; + break; case BUILT_IN_ATOMIC_CLEAR: return expand_builtin_atomic_clear (exp); diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 8b96f23aec05..e1898da22808 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -7080,25 +7080,17 @@ expand_atomic_test_and_set (rtx target, rtx mem, enum memmodel model) /* Recall that the legacy lock_test_and_set optab was allowed to do magic things with the value 1. Thus we try again without trueval. */ if (!ret && targetm.atomic_test_and_set_trueval != 1) - ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, model); - - /* Failing all else, assume a single threaded environment and simply - perform the operation. */ - if (!ret) { - /* If the result is ignored skip the move to target. */ - if (subtarget != const0_rtx) - emit_move_insn (subtarget, mem); + ret = maybe_emit_sync_lock_test_and_set (subtarget, mem, const1_rtx, model); - emit_move_insn (mem, trueval); - ret = subtarget; + if (ret) + { + /* Rectify the not-one trueval. */ + ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1); + gcc_assert (ret); + } } - /* Recall that have to return a boolean value; rectify if trueval - is not exactly one. */ - if (targetm.atomic_test_and_set_trueval != 1) - ret = emit_store_flag_force (target, NE, ret, const0_rtx, mode, 0, 1); - return ret; }