Message ID | 7bd3545a-7b9d-a9b2-6923-0d02df809177@netcologne.de |
---|---|
State | New |
Series | [fortran] Fix common subexpression elimination with IEEE rounding (PR108329) |
Hi Thomas,

Following your off-line explanation that the seemingly empty assembly line forces an effective reload from memory, all is now clear.

OK for mainline and for backporting as you see fit.

Thanks for the patch.

Paul

On Sat, 7 Jan 2023 at 15:46, Thomas Koenig via Fortran <fortran@gcc.gnu.org> wrote:

> Hello world,
>
> this patch fixes Fortran's handling of common subexpression elimination
> across ieee_set_rounding_mode calls. It does so using a rather big
> hammer, by issuing a memory barrier to force reload from memory
> (and thus a recomputation).
>
> This is a rather big hammer, so if there are more elegant ways
> to fix it, I am very much open to suggestions.
>
> If PR 34678 is fixed, then this solution can also be applied here.
>
> OK for trunk? How do you feel about a backport?
>
> Best regards
>
> Thomas
>
> Add memory barrier for calls to ieee_set_rounding_mode.
>
> gcc/fortran/ChangeLog:
>
>         PR fortran/108329
>         * trans-expr.cc (trans_memory_barrier): New function.
>         (gfc_conv_procedure_call): Insert memory barrier for
>         ieee_set_rounding_mode.
>
> gcc/testsuite/ChangeLog:
>
>         PR fortran/108329
>         * gfortran.dg/rounding_4.f90: New test.
On 08.01.2023 at 14:31, Paul Richard Thomas via Fortran <fortran@gcc.gnu.org> wrote:

> Hi Thomas,
>
> Following your off-line explanation that the seemingly empty looking
> assembly line forces an effective reload from memory, all is now clear.

It's not a full fix (for register vars), and it's "superior" to the call itself only because asm handling is implemented in a rather stupid way in the alias oracle. So I don't think this is a "fix" at all.

Richard

> OK for mainline and for backporting as you see fit.
>
> Thanks for the patch.
>
> Paul
Hi Richard,

>> On 08.01.2023 at 14:31, Paul Richard Thomas via Fortran <fortran@gcc.gnu.org> wrote:
>>
>> Hi Thomas,
>>
>> Following your off-line explanation that the seemingly empty looking
>> assembly line forces an effective reload from memory, all is now clear.
>
> It's not a full fix (for register vars) and it's "superior" to the call itself only because asm handling is implemented in a rather stupid way in the alias oracle. So I don't think this is a "fix" at all.

There are no register variables in Fortran, this is Fortran FE only, and it is a fix in the sense that correct code is no longer miscompiled.

There's a FIXME in the code pointing to the relevant PR precisely because I think that this is less than elegant (as do you, obviously). Do you have other suggestions for how to implement this? If PR 34678 is solved, this would probably provide a mechanism that we could simply re-use.

Best regards

Thomas
On Sun, Jan 8, 2023 at 5:21 PM Thomas Koenig <tkoenig@netcologne.de> wrote:

> Hi Richard,
>
> >> On 08.01.2023 at 14:31, Paul Richard Thomas via Fortran <fortran@gcc.gnu.org> wrote:
> >>
> >> Hi Thomas,
> >>
> >> Following your off-line explanation that the seemingly empty looking
> >> assembly line forces an effective reload from memory, all is now clear.
> >
> > It's not a full fix (for register vars) and it's "superior" to the call itself only because asm handling is implemented in a rather stupid way in the alias oracle. So I don't think this is a "fix" at all.
>
> There are no register variables in Fortran, this is Fortran FE only,
> and it is a fix in the sense that correct code is no longer miscompiled.

It's quite a big hammer, and the fact that it "works" is just luck. The memory barrier implied by the ieee_set_rounding_mode call itself does not have the same effect because arguments passed by reference are marked by the frontend so that they can be CSEd, since memory barriers may not affect them.

As said, the fact that this "works" is just because we're lazy on GIMPLE:

    /* If the statement STMT may clobber the memory reference REF return true,
       otherwise return false.  */

    bool
    stmt_may_clobber_ref_p_1 (gimple *stmt, ao_ref *ref, bool tbaa_p)
    {
    ...
      else if (gimple_code (stmt) == GIMPLE_ASM)
        return true;

> There's a FIXME in the code pointing to the relevant PR precisely
> because I think that this is less than elegant (as do you, obviously).
> Do you have other suggestions how to implement this? If PR 34678
> is solved, this would probably provide a mechanism that we could
> simply re-use.

There is no reliable way to get this correct at the moment, and if there were good and easy ways to get this working they'd be implemented already.

Richard.

> Best regards
>
> Thomas
Hi Richard,

> There is no reliable way to get this correct at the moment and if there
> were good and easy ways to get this working they'd be implemented already.

OK, then I withdraw the patch (and have unassigned myself from the PR).

Best regards

Thomas
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 4f3ae82d39c..29be7804e11 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -5981,6 +5981,20 @@ post_call:
   gfc_add_block_to_block (&parmse->post, &block);
 }
 
+/* Helper function - generate a memory barrier.  */
+
+static tree
+trans_memory_barrier (void)
+{
+  tree tmp;
+
+  tmp = gfc_build_string_const (sizeof ("memory"), "memory");
+  tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+                    gfc_build_string_const (1, ""), NULL_TREE, NULL_TREE,
+                    tree_cons (NULL_TREE, tmp, NULL_TREE), NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+  return tmp;
+}
 
 /* Generate code for a procedure call.  Note can return se->post != NULL.
    If se->direct_byref is set then se->expr contains the return parameter.
@@ -7692,6 +7706,19 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   else
     conv_base_obj_fcn_val (se, base_object, expr);
 
+  /* FIXME: Special handing of ieee_set_rounding_mode - we clobber
+     memory here to avoid common subexpression moving code past calls
+     to ieee_set_rounding_mode.  This should only be done for
+     floating point, but currently gcc offers no other possibility.
+     See PR 108329.  */
+
+  if (sym->from_intmod == INTMOD_IEEE_ARITHMETIC
+      && strcmp (sym->name, "ieee_set_rounding_mode") == 0)
+    {
+      tree tmp = trans_memory_barrier ();
+      gfc_add_expr_to_block (&post, tmp);
+    }
+
   /* If there are alternate return labels, function type should be
      integer.  Can't modify the type in place though, since it can be shared
      with other functions.  For dummy arguments, the typing is done to
diff --git a/gcc/testsuite/gfortran.dg/rounding_4.f90 b/gcc/testsuite/gfortran.dg/rounding_4.f90
new file mode 100644
index 00000000000..e8799da67dc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/rounding_4.f90
@@ -0,0 +1,31 @@
+! { dg-do run }
+module y
+  implicit none
+  integer, parameter :: wp = selected_real_kind(15)
+contains
+  subroutine foo(a,b,c)
+    use ieee_arithmetic
+    real(kind=wp), dimension(4), intent(out) :: a
+    real(kind=wp), intent(in) :: b, c
+    type (ieee_round_type), dimension(4), parameter :: mode = &
+         [ieee_nearest, ieee_to_zero, ieee_up, ieee_down]
+    call ieee_set_rounding_mode (mode(1))
+    a(1) = b + c
+    call ieee_set_rounding_mode (mode(2))
+    a(2) = b + c
+    call ieee_set_rounding_mode (mode(3))
+    a(3) = b + c
+    call ieee_set_rounding_mode (mode(4))
+    a(4) = b + c
+  end subroutine foo
+end module y
+
+program main
+  use y
+  real(kind=wp), dimension(4) :: a
+  call foo(a,0.1_wp,0.2_wp)
+  if (a(1) <= a(2)) stop 1
+  if (a(3) <= a(4)) stop 2
+  if (a(1) /= a(3)) stop 3
+  if (a(2) /= a(4)) stop 4
+end program main
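
For readers unfamiliar with the idiom discussed in this thread, here is a minimal C++ sketch, not part of the patch, of what the barrier built by trans_memory_barrier roughly corresponds to at the source level: an empty volatile asm statement with a "memory" clobber, placed after each rounding-mode switch. It assumes a GCC- or Clang-compatible compiler (the asm syntax is an extension) and a target whose hardware honours the dynamic rounding mode. The names memory_barrier, operand_b, operand_c, results and sum_in_all_modes are hypothetical and chosen only for illustration. The operands are deliberately kept in globals with external linkage, because, as noted in the discussion above, the clobber only forces a reload of values that live in memory; values held in registers are not covered.

```cpp
#include <cassert>  // for the checks described in the usage note
#include <cfenv>

// Operands and results live in globals with external linkage so that the
// "memory" clobber below applies to them (hypothetical names).
double operand_b = 0.1;
double operand_c = 0.2;
double results[4];

// Empty volatile asm with a "memory" clobber (GCC/Clang extension).  The
// compiler must assume that any memory may have been read or written here,
// so values loaded before the barrier cannot be reused after it.
static inline void memory_barrier()
{
  __asm__ __volatile__("" ::: "memory");
}

// Compute operand_b + operand_c once per rounding mode, mirroring the
// structure of subroutine foo in rounding_4.f90.  Returns false if any
// rounding-mode switch was rejected by the C library.
bool sum_in_all_modes()
{
  const int modes[4] = {FE_TONEAREST, FE_TOWARDZERO, FE_UPWARD, FE_DOWNWARD};
  bool ok = true;
  for (int i = 0; i < 4; ++i)
    {
      ok = (std::fesetround(modes[i]) == 0) && ok;
      memory_barrier();  // operands must be reloaded after the mode switch
      results[i] = operand_b + operand_c;
      memory_barrier();  // the store must be done before the next switch
    }
  std::fesetround(FE_TONEAREST);  // restore the default mode
  return ok;
}
```

After calling sum_in_all_modes(), the checks from rounding_4.f90 translate to results[0] > results[1], results[2] > results[3], results[0] == results[2] and results[1] == results[3]. This is only a sketch of the workaround under the stated assumptions; it shares the limitation raised in the review, namely that it relies on how the compiler treats asm statements rather than on a real model of the floating-point environment.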