Message ID: 20240617140501.32CF5139AB@imap1.dmz-prg2.suse.org
State: New
Series: [v2] Enhance if-conversion for automatic arrays
On 6/17/24 16:05, Richard Biener wrote:

> Automatic arrays that are not address-taken should not be subject to
> store data races. This applies to OMP SIMD in-branch lowered
> functions result array which for the testcase otherwise prevents
> vectorization with SSE and for AVX and AVX512 ends up with spurious
> .MASK_STORE to the stack surviving.

Does this also apply for "automatic arrays" as defined by the Fortran Standard (see https://j3-fortran.org/doc/year/23/23-007r1.pdf, page 104), i.e., outside of the OMP_SIMD construct ?

In gfortran, when using the option -fstack-arrays, they are assigned memory space on the stack.

Kind regards,
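As a gloss on the "store data races" the description refers to, here is a minimal C sketch (my own example, not part of the patch or the thread). If-conversion may turn a conditional store into an unconditional one; that is only safe when no other thread can observe the stored-to location, which is guaranteed for an automatic array whose address is never taken.

int shared[1024];                  /* other threads may access this */

int
f (const int *cond, const int *val, int n)   /* assumes n <= 1024 */
{
  int local[1024] = { 0 };         /* automatic, address never taken */
  int sum = 0;

  for (int i = 0; i < n; i++)
    {
      /* Rewriting this as "shared[i] = cond[i] ? val[i] : shared[i];"
         stores to shared[i] even when cond[i] is false -- a store data
         race unless -fallow-store-data-races is given (or a masked
         store is used instead).  */
      if (cond[i])
        shared[i] = val[i];

      /* The same rewrite on local[] is always safe: no other thread
         can access it.  This is the case the patch exploits.  */
      if (cond[i])
        local[i] = val[i];
      sum += local[i];
    }
  return sum;
}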
> On 19.06.2024 at 20:25, Toon Moene <toon@moene.org> wrote:
>
> On 6/17/24 16:05, Richard Biener wrote:
>
>> Automatic arrays that are not address-taken should not be subject to
>> store data races. This applies to OMP SIMD in-branch lowered
>> functions result array which for the testcase otherwise prevents
>> vectorization with SSE and for AVX and AVX512 ends up with spurious
>> .MASK_STORE to the stack surviving.
>
> Does this also apply for "automatic arrays" as defined by the Fortran Standard (see https://j3-fortran.org/doc/year/23/23-007r1.pdf, page 104), i.e., outside of the OMP_SIMD construct ?
>
> In gfortran, when using the option -fstack-arrays, they are assigned memory space on the stack.

I’d say yes though the likelihood those are address taken and thus not considered is high. The main target were the arrays created as part of the SIMD lowering.

Richard

> Kind regards,
>
> --
> Toon Moene - e-mail: toon@moene.org - phone: +31 346 214290
> Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
>
On 6/19/24 21:06, Richard Biener wrote:

>> On 19.06.2024 at 20:25, Toon Moene <toon@moene.org> wrote:
>>
>> On 6/17/24 16:05, Richard Biener wrote:
>>
>>> Automatic arrays that are not address-taken should not be subject to
>>> store data races. This applies to OMP SIMD in-branch lowered
>>> functions result array which for the testcase otherwise prevents
>>> vectorization with SSE and for AVX and AVX512 ends up with spurious
>>> .MASK_STORE to the stack surviving.
>>
>> Does this also apply for "automatic arrays" as defined by the Fortran Standard (see https://j3-fortran.org/doc/year/23/23-007r1.pdf, page 104), i.e., outside of the OMP_SIMD construct ?
>>
>> In gfortran, when using the option -fstack-arrays, they are assigned memory space on the stack.
>
> I’d say yes though the likelihood those are address taken and thus not considered is high. The main target were the arrays created as part of the SIMD lowering.

Isn't there a "not" missing before "high" ?

So it mostly helps after the call to subroutine 'sub' in the following:

      SUBROUTINE AAP(A, B, N)
      INTEGER N
      REAL A(N), B(N), R(N)
      CALL SUB(R, N)    ! Address of R passed to SUB
      R = ABS(A)
      B = R
      B = SQRT(A)
      END

?
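To make the address-taken distinction concrete, here is a C analogue of the Fortran example (my own sketch, not from the thread; sub() stands in for SUB). As far as I understand, being address-taken is a property of the variable across the whole function, so the array whose address escapes is treated conservatively throughout, not only before the call.

void sub (int *r, int n);

void
aap (const int *a, int *b, int n)      /* assumes n <= 64 */
{
  int r_escaped[64];                   /* address passed to sub(): may be
                                          aliased, so conditional stores to
                                          it still need -fallow-store-data-races
                                          or a masked store to be if-converted.  */
  int r_local[64] = { 0 };             /* never address-taken: the new
                                          ref_can_have_store_data_races ()
                                          returns false for it.  */

  sub (r_escaped, n);                  /* like CALL SUB(R, N) above */

  for (int i = 0; i < n; i++)
    {
      if (a[i] > 0)
        r_escaped[i] = a[i];
      if (a[i] > 0)
        r_local[i] = a[i];
      b[i] = r_local[i] + r_escaped[i];   /* r_escaped[i] assumed written by sub() */
    }
}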
diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-21.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-21.c
new file mode 100644
index 00000000000..49c52fb59bd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-21.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_simd_clones } */
+/* { dg-additional-options "-fopenmp-simd" } */
+
+#pragma omp declare simd simdlen(4) inbranch
+__attribute__((noinline)) int
+foo (int a, int b)
+{
+  return a + b;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" { target i?86-*-* x86_64-*-* } } } */
+/* if-conversion shouldn't need to resort to masked stores for the result
+   array created by OMP lowering since that's automatic and does not have
+   its address taken.  */
+/* { dg-final { scan-tree-dump-not "MASK_STORE" "vect" } } */
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index c4c3ed41a44..57992b6deca 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -936,12 +936,11 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
 
       /* an unconditionaly write won't trap if the base is written
          to unconditionally.  */
-      if (base_master_dr
-          && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
-        return flag_store_data_races;
-      /* or the base is known to be not readonly.  */
-      else if (base_object_writable (DR_REF (a)))
-        return flag_store_data_races;
+      if ((base_master_dr
+           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
+          /* or the base is known to be not readonly.  */
+          || base_object_writable (DR_REF (a)))
+        return !ref_can_have_store_data_races (base);
     }
 
   return false;
diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc
index 1a91d63a31e..fab048b0b59 100644
--- a/gcc/tree-ssa-alias.cc
+++ b/gcc/tree-ssa-alias.cc
@@ -3704,6 +3704,25 @@ stmt_kills_ref_p (gimple *stmt, tree ref)
   return stmt_kills_ref_p (stmt, &r);
 }
 
+/* Return whether REF can be subject to store data races.  */
+
+bool
+ref_can_have_store_data_races (tree ref)
+{
+  /* With -fallow-store-data-races do not care about them.  */
+  if (flag_store_data_races)
+    return false;
+
+  tree base = get_base_address (ref);
+  if (auto_var_p (base)
+      && ! may_be_aliased (base))
+    /* Automatic variables not aliased are not subject to
+       data races.  */
+    return false;
+
+  return true;
+}
+
 /* Walk the virtual use-def chain of VUSE until hitting the virtual operand
    TARGET or a statement clobbering the memory reference REF in which
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 5cd64e72295..5834533ae9c 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -144,6 +144,8 @@ extern bool call_may_clobber_ref_p (gcall *, tree, bool = true);
 extern bool call_may_clobber_ref_p_1 (gcall *, ao_ref *, bool = true);
 extern bool stmt_kills_ref_p (gimple *, tree);
 extern bool stmt_kills_ref_p (gimple *, ao_ref *);
+extern bool ref_can_have_store_data_races (tree);
+
 enum translate_flags
   { TR_TRANSLATE, TR_VALUEIZE_AND_DISAMBIGUATE, TR_DISAMBIGUATE };
 extern tree get_continuation_for_phi (gimple *, ao_ref *, bool,
diff --git a/gcc/tree-ssa-loop-im.cc b/gcc/tree-ssa-loop-im.cc
index f3fda2bd7ce..3acbd886a0d 100644
--- a/gcc/tree-ssa-loop-im.cc
+++ b/gcc/tree-ssa-loop-im.cc
@@ -2298,7 +2298,7 @@ execute_sm (class loop *loop, im_mem_ref *ref,
   bool always_stored = ref_always_accessed_p (loop, ref, true);
   if (maybe_mt
       && (bb_in_transaction (loop_preheader_edge (loop)->src)
-          || (! flag_store_data_races && ! always_stored)))
+          || (ref_can_have_store_data_races (ref->mem.ref) && ! always_stored)))
     multi_threaded_model_p = true;
 
   if (multi_threaded_model_p && !use_other_flag_var)
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index 0ef42a1031a..f05ca727503 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -3343,9 +3343,7 @@ cond_store_replacement (basic_block middle_bb, basic_block join_bb,
       /* If LHS is an access to a local variable without address-taken
          (or when we allow data races) and known not to trap, we could
          always safely move down the store.  */
-      tree base = get_base_address (lhs);
-      if (!auto_var_p (base)
-          || (TREE_ADDRESSABLE (base) && !flag_store_data_races)
+      if (ref_can_have_store_data_races (lhs)
           || tree_could_trap_p (lhs))
         return false;
     }
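For context on the new testcase, here is a rough, hand-written C approximation (not the compiler's actual output; the clone name and exact shape are illustrative) of what the in-branch SIMD clone of foo() looks like after OMP SIMD lowering. The per-lane results are collected in an automatic array that is never address-taken, which is why, with this patch, if-conversion no longer needs a .MASK_STORE for it and the lane loop can vectorize.

typedef int v4si __attribute__ ((vector_size (16)));

v4si
foo_simd_inbranch (v4si a, v4si b, v4si mask)   /* illustrative name */
{
  /* Zero-initialized here only to keep the sketch well defined; the
     real lowering leaves inactive lanes unspecified.  */
  int retval[4] = { 0 };          /* automatic result array */

  for (int lane = 0; lane < 4; lane++)
    if (mask[lane])               /* inbranch: only active lanes execute */
      retval[lane] = a[lane] + b[lane];

  /* Gather the result array back into the return vector.  */
  v4si res = { retval[0], retval[1], retval[2], retval[3] };
  return res;
}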