From patchwork Wed Jul 31 20:07:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967296 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=odkFqEwS; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ3Cs5x00z1ybX for ; Thu, 1 Aug 2024 06:12:09 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C79863858282 for ; Wed, 31 Jul 2024 20:12:07 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp.smtpout.orange.fr (smtp-30.smtpout.orange.fr [80.12.242.30]) by sourceware.org (Postfix) with ESMTPS id 2F1453858294; Wed, 31 Jul 2024 20:07:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2F1453858294 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2F1453858294 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.30 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456472; cv=none; b=AZxufZokXbxHT+aSiYNot2mSCyQRXbbECfqo0Zg/q2z0r+Ngc6OsxePf/vijPhbAyE9dz8y921fljbLCCsdPbU4xaedJfLN7PpsKLSFajyabPRmHrHOv61j/4ygepX4ZaCNs9r08dQC6oxqFhsN/4Qgu+oWhZhe8NJYZ9ZDEzbw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456472; c=relaxed/simple; bh=Ute6IM93JUWGiP2yZUFtPdovrSxnpelgpZa+oY334n8=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=MRMct82GTQ/kJoAgmI9J+8L25KR0FmwUzoR7O3RSyZZL2VFXoFKOn8j+jUbeE+mWUSKs2TORzk4UVe2hKySQcAirUHzNMnWIiWL2gZBN03MggSECPFbHNwUBbLQOf+qD4rhlVbT7++bFwOyenbm99Fe5QJKaeuIcbI6D/xn/sw8= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbqsqYRM; Wed, 31 Jul 2024 22:07:43 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456463; bh=bWDCPwbSrH9zBSzqitQrcdcsWJY5FfCsoyxv2uuH3Ss=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=odkFqEwSjH0+x2JlHst+QRhdtrjRnL9lxyDyjJWr+HrM0D2Yvjz6bKfFMyO17hkhg DE/BMAYPTUblKuuxuXhVT6yPniB+OPrhU5DyqZxo9vFM0xsBXVOclLJrT4+EEVlVXz 6L5iPFT4zvkoblQhBhwEMNDPgZSZZoQqUIYv7Q2TUyzUZFZm5egxyF9Z0c5RXPy6H+ 1bdKNrk28y2po8kD9XQGxJXJvXb7vaSPTTSyYdH7Whqa7hpEYlwzMS+Zfw+NFQ+hsn u1ccVXYplZOfSeQjbWGIxetgYkKjO8jelDZG8vEtReErDJA6yMvpv86TlG8xNNwAMb HULpHama2+J3w== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:43 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 1/8] fortran: Add tests covering inline MINLOC/MAXLOC without DIM [PR90608] Date: Wed, 31 Jul 2024 22:07:28 +0200 Message-ID: <20240731200735.229898-2-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Add the tests covering the various cases for which we are about to implement inline expansion of MINLOC and MAXLOC. Those are cases where the DIM argument is not present. PR fortran/90608 gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_7.f90: New test. * gfortran.dg/maxloc_with_mask_1.f90: New test. * gfortran.dg/minloc_8.f90: New test. * gfortran.dg/minloc_with_mask_1.f90: New test. --- gcc/testsuite/gfortran.dg/maxloc_7.f90 | 220 ++++++++++ .../gfortran.dg/maxloc_with_mask_1.f90 | 393 ++++++++++++++++++ gcc/testsuite/gfortran.dg/minloc_8.f90 | 220 ++++++++++ .../gfortran.dg/minloc_with_mask_1.f90 | 392 +++++++++++++++++ 4 files changed, 1225 insertions(+) create mode 100644 gcc/testsuite/gfortran.dg/maxloc_7.f90 create mode 100644 gcc/testsuite/gfortran.dg/maxloc_with_mask_1.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_8.f90 create mode 100644 gcc/testsuite/gfortran.dg/minloc_with_mask_1.f90 diff --git a/gcc/testsuite/gfortran.dg/maxloc_7.f90 b/gcc/testsuite/gfortran.dg/maxloc_7.f90 new file mode 100644 index 00000000000..a875083052a --- /dev/null +++ b/gcc/testsuite/gfortran.dg/maxloc_7.f90 @@ -0,0 +1,220 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check the correct behaviour of the inline maxloc implementation, +! when there is no optional argument. + +program p + implicit none + integer, parameter :: data5(*) = (/ 1, 7, 2, 7, 0 /) + integer, parameter :: data64(*) = (/ 2, 5, 4, 6, 0, 9, 3, 5, & + 4, 4, 1, 7, 3, 2, 1, 2, & + 5, 4, 6, 0, 9, 3, 5, 4, & + 4, 1, 7, 3, 2, 1, 2, 5, & + 4, 6, 0, 9, 3, 5, 4, 4, & + 1, 7, 3, 2, 1, 2, 5, 4, & + 6, 0, 9, 3, 5, 4, 4, 1, & + 7, 3, 2, 1, 2, 5, 4, 6 /) + call check_int_const_shape_rank_1 + call check_int_const_shape_rank_3 + call check_int_const_shape_empty_4 + call check_int_alloc_rank_1 + call check_int_alloc_rank_3 + call check_int_alloc_empty_4 + call check_real_const_shape_rank_1 + call check_real_const_shape_rank_3 + call check_real_const_shape_empty_4 + call check_real_alloc_rank_1 + call check_real_alloc_rank_3 + call check_real_alloc_empty_4 + call check_int_lower_bounds + call check_real_lower_bounds + call check_all_nans + call check_dependencies +contains + subroutine check_int_const_shape_rank_1() + integer :: a(5) + integer, allocatable :: m(:) + a = data5 + m = maxloc(a) + if (size(m, dim=1) /= 1) stop 11 + if (any(m /= (/ 2 /))) stop 12 + end subroutine + subroutine check_int_const_shape_rank_3() + integer :: a(4,4,4) + integer, allocatable :: m(:) + a = reshape(data64, shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 21 + if (any(m /= (/ 2, 2, 1 /))) stop 22 + end subroutine + subroutine check_int_const_shape_empty_4() + integer :: a(9,3,0,7) + integer, allocatable :: m(:) + a = reshape((/ integer:: /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 4) stop 31 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 32 + end subroutine + subroutine check_int_alloc_rank_1() + integer, allocatable :: a(:) + integer, allocatable :: m(:) + allocate(a(5)) + a(:) = data5 + m = maxloc(a) + if (size(m, dim=1) /= 1) stop 41 + if (any(m /= (/ 2 /))) stop 42 + end subroutine + subroutine check_int_alloc_rank_3() + integer, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 51 + if (any(m /= (/ 2, 2, 1 /))) stop 52 + end subroutine + subroutine check_int_alloc_empty_4() + integer, allocatable :: a(:,:,:,:) + integer, allocatable :: m(:) + allocate(a(9,3,0,7)) + a(:,:,:,:) = reshape((/ integer:: /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 4) stop 61 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 62 + end subroutine + subroutine check_real_const_shape_rank_1() + real :: a(5) + integer, allocatable :: m(:) + a = (/ real:: data5 /) + m = maxloc(a) + if (size(m, dim=1) /= 1) stop 71 + if (any(m /= (/ 2 /))) stop 72 + end subroutine + subroutine check_real_const_shape_rank_3() + real :: a(4,4,4) + integer, allocatable :: m(:) + a = reshape((/ real:: data64 /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 81 + if (any(m /= (/ 2, 2, 1 /))) stop 82 + end subroutine + subroutine check_real_const_shape_empty_4() + real :: a(9,3,0,7) + integer, allocatable :: m(:) + a = reshape((/ real:: /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 4) stop 91 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 92 + end subroutine + subroutine check_real_alloc_rank_1() + real, allocatable :: a(:) + integer, allocatable :: m(:) + allocate(a(5)) + a(:) = (/ real:: data5 /) + m = maxloc(a) + if (size(m, dim=1) /= 1) stop 111 + if (any(m /= (/ 2 /))) stop 112 + end subroutine + subroutine check_real_alloc_rank_3() + real, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 121 + if (any(m /= (/ 2, 2, 1 /))) stop 122 + end subroutine + subroutine check_real_alloc_empty_4() + real, allocatable :: a(:,:,:,:) + integer, allocatable :: m(:) + allocate(a(9,3,0,7)) + a(:,:,:,:) = reshape((/ real:: /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 4) stop 131 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 132 + end subroutine + subroutine check_int_lower_bounds() + integer, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(3:6,-1:2,4)) + a(:,:,:) = reshape(data64, shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 141 + if (any(m /= (/ 2, 2, 1 /))) stop 142 + end subroutine + subroutine check_real_lower_bounds() + real, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(3:6,-1:2,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 151 + if (any(m /= (/ 2, 2, 1 /))) stop 152 + end subroutine + subroutine check_all_nans() + real, allocatable :: a(:,:,:) + real :: nan + integer, allocatable :: m(:) + nan = 0 + nan = nan / nan + allocate(a(3,3,3), source = nan) + m = maxloc(a) + if (size(m, dim=1) /= 3) stop 161 + if (any(m /= (/ 1, 1, 1 /))) stop 162 + end subroutine + elemental subroutine set(o, i) + integer, intent(out) :: o + integer, intent(in) :: i + o = i + end subroutine + subroutine check_dependencies() + integer, allocatable :: a(:,:,:) + allocate(a(3,3,3)) + ! Direct assignment + a(:,:,:) = reshape(data64(1:27), shape(a)) + a(1,1,:) = maxloc(a) + if (any(a(1,1,:) /= (/ 3, 2, 1 /))) stop 171 + a(:,:,:) = reshape(data64(2:28), shape(a)) + a(3,3,:) = maxloc(a) + if (any(a(3,3,:) /= (/ 2, 2, 1 /))) stop 172 + a(:,:,:) = reshape(data64(3:29), shape(a)) + a(1,:,1) = maxloc(a) + if (any(a(1,:,1) /= (/ 1, 2, 1 /))) stop 173 + a(:,:,:) = reshape(data64(5:31), shape(a)) + a(2,:,2) = maxloc(a) + if (any(a(2,:,2) /= (/ 2, 1, 1 /))) stop 174 + a(:,:,:) = reshape(data64(6:32), shape(a)) + a(3,:,3) = maxloc(a) + if (any(a(3,:,3) /= (/ 1, 1, 1 /))) stop 175 + a(:,:,:) = reshape(data64(7:33), shape(a)) + a(:,1,1) = maxloc(a) + if (any(a(:,1,1) /= (/ 3, 2, 2 /))) stop 176 + a(:,:,:) = reshape(data64(8:34), shape(a)) + a(:,3,3) = maxloc(a) + if (any(a(:,3,3) /= (/ 2, 2, 2 /))) stop 177 + ! Subroutine assignment + a(:,:,:) = reshape(data64(9:35), shape(a)) + call set(a(1,1,:), maxloc(a)) + if (any(a(1,1,:) /= (/ 1, 2, 2 /))) stop 181 + a(:,:,:) = reshape(data64(10:36), shape(a)) + call set(a(3,3,:), maxloc(a)) + if (any(a(3,3,:) /= (/ 3, 1, 2 /))) stop 182 + a(:,:,:) = reshape(data64(11:37), shape(a)) + call set(a(1,:,1), maxloc(a)) + if (any(a(1,:,1) /= (/ 2, 1, 2 /))) stop 183 + a(:,:,:) = reshape(data64(12:38), shape(a)) + call set(a(2,:,2), maxloc(a)) + if (any(a(2,:,2) /= (/ 1, 1, 2 /))) stop 184 + a(:,:,:) = reshape(data64(13:39), shape(a)) + call set(a(3,:,3), maxloc(a)) + if (any(a(3,:,3) /= (/ 3, 3, 1 /))) stop 185 + a(:,:,:) = reshape(data64(14:40), shape(a)) + call set(a(:,1,1), maxloc(a)) + if (any(a(:,1,1) /= (/ 2, 3, 1 /))) stop 186 + a(:,:,:) = reshape(data64(15:41), shape(a)) + call set(a(:,3,3), maxloc(a)) + if (any(a(:,3,3) /= (/ 1, 3, 1 /))) stop 187 + call set(a(1,:,:), maxloc(a, dim=1)) + end subroutine check_dependencies +end program p diff --git a/gcc/testsuite/gfortran.dg/maxloc_with_mask_1.f90 b/gcc/testsuite/gfortran.dg/maxloc_with_mask_1.f90 new file mode 100644 index 00000000000..d029017e38c --- /dev/null +++ b/gcc/testsuite/gfortran.dg/maxloc_with_mask_1.f90 @@ -0,0 +1,393 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check the correct behaviour of the inline maxloc implementation, +! when there is a mask argument. + +program p + implicit none + integer, parameter :: data10(*) = (/ 7, 4, 7, 6, 6, 4, 6, 3, 9, 8 /) + logical, parameter :: mask10(*) = (/ .false., .true., .false., & + .false., .true., .true., & + .true. , .true., .false., & + .false. /) + integer, parameter :: data64(*) = (/ 2, 5, 4, 6, 0, 9, 3, 5, & + 4, 4, 1, 7, 3, 2, 1, 2, & + 5, 4, 6, 0, 9, 3, 5, 4, & + 4, 1, 7, 3, 2, 1, 2, 5, & + 4, 6, 0, 9, 3, 5, 4, 4, & + 1, 7, 3, 2, 1, 2, 5, 4, & + 6, 0, 9, 3, 5, 4, 4, 1, & + 7, 3, 2, 1, 2, 5, 4, 6 /) + logical, parameter :: mask64(*) = (/ .true. , .false., .false., .false., & + .true. , .false., .true. , .false., & + .false., .true. , .true. , .false., & + .true. , .true. , .true. , .true. , & + .false., .true. , .false., .true. , & + .false., .true. , .false., .true. , & + .true. , .false., .false., .true. , & + .true. , .true. , .true. , .false., & + .false., .false., .true. , .false., & + .true. , .false., .true. , .true. , & + .true. , .false., .true. , .true. , & + .false., .true. , .false., .true. , & + .false., .true. , .false., .false., & + .false., .true. , .true. , .true. , & + .false., .true. , .false., .true. , & + .true. , .false., .false., .false. /) + call check_int_const_shape_rank_1 + call check_int_const_shape_rank_3 + call check_int_const_shape_rank_3_true_mask + call check_int_const_shape_rank_3_false_mask + call check_int_const_shape_rank_3_optional_mask_present + call check_int_const_shape_rank_3_optional_mask_absent + call check_int_const_shape_empty_4 + call check_int_alloc_rank_1 + call check_int_alloc_rank_3 + call check_int_alloc_rank_3_true_mask + call check_int_alloc_rank_3_false_mask + call check_int_alloc_empty_4 + call check_real_const_shape_rank_1 + call check_real_const_shape_rank_3 + call check_real_const_shape_rank_3_true_mask + call check_real_const_shape_rank_3_false_mask + call check_real_const_shape_rank_3_optional_mask_present + call check_real_const_shape_rank_3_optional_mask_absent + call check_real_const_shape_empty_4 + call check_real_alloc_rank_1 + call check_real_alloc_rank_3 + call check_real_alloc_rank_3_true_mask + call check_real_alloc_rank_3_false_mask + call check_real_alloc_empty_4 + call check_lower_bounds + call check_all_nans + call check_dependencies +contains + subroutine check_int_const_shape_rank_1() + integer :: a(10) + logical :: m(10) + integer, allocatable :: r(:) + a = data10 + m = mask10 + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 11 + if (any(r /= (/ 5 /))) stop 12 + end subroutine + subroutine check_int_const_shape_rank_3() + integer :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + m = reshape(mask64, shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 21 + if (any(r /= (/ 2, 3, 1 /))) stop 22 + end subroutine + subroutine check_int_const_shape_rank_3_true_mask() + integer :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + r = maxloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 31 + if (any(r /= (/ 2, 2, 1 /))) stop 32 + end subroutine + subroutine check_int_const_shape_rank_3_false_mask() + integer :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + r = maxloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 41 + if (any(r /= (/ 0, 0, 0 /))) stop 42 + end subroutine + subroutine call_maxloc_int(r, a, m) + integer :: a(:,:,:) + logical, optional :: m(:,:,:) + integer, allocatable :: r(:) + r = maxloc(a, mask = m) + end subroutine + subroutine check_int_const_shape_rank_3_optional_mask_present() + integer :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + m = reshape(mask64, shape(m)) + call call_maxloc_int(r, a, m) + if (size(r, dim = 1) /= 3) stop 51 + if (any(r /= (/ 2, 3, 1 /))) stop 52 + end subroutine + subroutine check_int_const_shape_rank_3_optional_mask_absent() + integer :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + call call_maxloc_int(r, a) + if (size(r, dim = 1) /= 3) stop 61 + if (any(r /= (/ 2, 2, 1 /))) stop 62 + end subroutine + subroutine check_int_const_shape_empty_4() + integer :: a(9,3,0,7) + logical :: m(9,3,0,7) + integer, allocatable :: r(:) + a = reshape((/ integer:: /), shape(a)) + m = reshape((/ logical:: /), shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 71 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 72 + end subroutine + subroutine check_int_alloc_rank_1() + integer, allocatable :: a(:) + logical, allocatable :: m(:) + integer, allocatable :: r(:) + allocate(a(10), m(10)) + a(:) = data10 + m(:) = mask10 + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 81 + if (any(r /= (/ 5 /))) stop 82 + end subroutine + subroutine check_int_alloc_rank_3() + integer, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4), m(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + m(:,:,:) = reshape(mask64, shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 91 + if (any(r /= (/ 2, 3, 1 /))) stop 92 + end subroutine + subroutine check_int_alloc_rank_3_true_mask() + integer, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + r = maxloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 101 + if (any(r /= (/ 2, 2, 1 /))) stop 102 + end subroutine + subroutine check_int_alloc_rank_3_false_mask() + integer, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + r = maxloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 111 + if (any(r /= (/ 0, 0, 0 /))) stop 112 + end subroutine + subroutine check_int_alloc_empty_4() + integer, allocatable :: a(:,:,:,:) + logical, allocatable :: m(:,:,:,:) + integer, allocatable :: r(:) + allocate(a(9,3,0,7), m(9,3,0,7)) + a(:,:,:,:) = reshape((/ integer:: /), shape(a)) + m(:,:,:,:) = reshape((/ logical:: /), shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 121 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 122 + end subroutine + subroutine check_real_const_shape_rank_1() + real :: a(10) + logical :: m(10) + integer, allocatable :: r(:) + a = (/ real:: data10 /) + m = mask10 + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 131 + if (any(r /= (/ 5 /))) stop 132 + end subroutine + subroutine check_real_const_shape_rank_3() + real :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + m = reshape(mask64, shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 141 + if (any(r /= (/ 2, 3, 1 /))) stop 142 + end subroutine + subroutine check_real_const_shape_rank_3_true_mask() + real :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + r = maxloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 151 + if (any(r /= (/ 2, 2, 1 /))) stop 152 + end subroutine + subroutine check_real_const_shape_rank_3_false_mask() + real :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + r = maxloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 161 + if (any(r /= (/ 0, 0, 0 /))) stop 162 + end subroutine + subroutine call_maxloc_real(r, a, m) + real :: a(:,:,:) + logical, optional :: m(:,:,:) + integer, allocatable :: r(:) + r = maxloc(a, mask = m) + end subroutine + subroutine check_real_const_shape_rank_3_optional_mask_present() + real :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + m = reshape(mask64, shape(m)) + call call_maxloc_real(r, a, m) + if (size(r, dim = 1) /= 3) stop 171 + if (any(r /= (/ 2, 3, 1 /))) stop 172 + end subroutine + subroutine check_real_const_shape_rank_3_optional_mask_absent() + real :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + call call_maxloc_real(r, a) + if (size(r, dim = 1) /= 3) stop 181 + if (any(r /= (/ 2, 2, 1 /))) stop 182 + end subroutine + subroutine check_real_const_shape_empty_4() + real :: a(9,3,0,7) + logical :: m(9,3,0,7) + integer, allocatable :: r(:) + a = reshape((/ real:: /), shape(a)) + m = reshape((/ logical:: /), shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 191 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 192 + end subroutine + subroutine check_real_alloc_rank_1() + real, allocatable :: a(:) + logical, allocatable :: m(:) + integer, allocatable :: r(:) + allocate(a(10), m(10)) + a(:) = (/ real:: data10 /) + m(:) = mask10 + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 201 + if (any(r /= (/ 5 /))) stop 202 + end subroutine + subroutine check_real_alloc_rank_3() + real, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4), m(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m(:,:,:) = reshape(mask64, shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 211 + if (any(r /= (/ 2, 3, 1 /))) stop 212 + end subroutine + subroutine check_real_alloc_rank_3_true_mask() + real, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + r = maxloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 221 + if (any(r /= (/ 2, 2, 1 /))) stop 222 + end subroutine + subroutine check_real_alloc_rank_3_false_mask() + real, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + r = maxloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 231 + if (any(r /= (/ 0, 0, 0 /))) stop 232 + end subroutine + subroutine check_real_alloc_empty_4() + real, allocatable :: a(:,:,:,:) + logical, allocatable :: m(:,:,:,:) + integer, allocatable :: r(:) + allocate(a(9,3,0,7), m(9,3,0,7)) + a(:,:,:,:) = reshape((/ real:: /), shape(a)) + m(:,:,:,:) = reshape((/ logical :: /), shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 241 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 242 + end subroutine + subroutine check_lower_bounds() + real, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + integer, allocatable :: r(:) + allocate(a(3:6,-1:2,4), m(3:6,-1:2,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m = reshape(mask64, shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 251 + if (any(r /= (/ 2, 3, 1 /))) stop 252 + end subroutine + subroutine check_all_nans() + real, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + real :: nan + integer, allocatable :: r(:) + nan = 0 + nan = nan / nan + allocate(a(3,3,3), source = nan) + allocate(m(3,3,3)) + m(:,:,:) = reshape((/ .false., .false., .true. , .true. , .false., & + .true. , .false., .false., .false., .true. , & + .true. , .false., .true. , .true. , .true. , & + .false., .false., .true. , .true. , .false., & + .false., .true. , .false., .false., .true. , & + .true. , .true. /), shape(m)) + r = maxloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 261 + if (any(r /= (/ 3, 1, 1 /))) stop 262 + end subroutine + elemental subroutine set(o, i) + integer, intent(out) :: o + integer, intent(in) :: i + o = i + end subroutine + subroutine check_dependencies() + integer, allocatable :: a(:,:,:) + allocate(a(3,3,3)) + ! Direct assignment + a(:,:,:) = reshape(data64(1:27), shape(a)) + a(1,1,:) = maxloc(a, mask=a>1) + print *, a(1,1,:) + if (any(a(1,1,:) /= (/ 3, 2, 1 /))) stop 171 + a(:,:,:) = reshape(data64(2:28), shape(a)) + a(3,3,:) = maxloc(a, mask=a>1) + if (any(a(3,3,:) /= (/ 2, 2, 1 /))) stop 172 + a(:,:,:) = reshape(data64(3:29), shape(a)) + a(1,:,1) = maxloc(a, mask=a>1) + if (any(a(1,:,1) /= (/ 1, 2, 1 /))) stop 173 + a(:,:,:) = reshape(data64(5:31), shape(a)) + a(2,:,2) = maxloc(a, mask=a>1) + if (any(a(2,:,2) /= (/ 2, 1, 1 /))) stop 174 + a(:,:,:) = reshape(data64(6:32), shape(a)) + a(3,:,3) = maxloc(a, mask=a>1) + if (any(a(3,:,3) /= (/ 1, 1, 1 /))) stop 175 + a(:,:,:) = reshape(data64(7:33), shape(a)) + a(:,1,1) = maxloc(a, mask=a>1) + if (any(a(:,1,1) /= (/ 3, 2, 2 /))) stop 176 + a(:,:,:) = reshape(data64(8:34), shape(a)) + a(:,3,3) = maxloc(a, mask=a>1) + if (any(a(:,3,3) /= (/ 2, 2, 2 /))) stop 177 + ! Subroutine assignment + a(:,:,:) = reshape(data64(9:35), shape(a)) + call set(a(1,1,:), maxloc(a, mask=a>1)) + if (any(a(1,1,:) /= (/ 1, 2, 2 /))) stop 181 + a(:,:,:) = reshape(data64(10:36), shape(a)) + call set(a(3,3,:), maxloc(a, mask=a>1)) + if (any(a(3,3,:) /= (/ 3, 1, 2 /))) stop 182 + a(:,:,:) = reshape(data64(11:37), shape(a)) + call set(a(1,:,1), maxloc(a, mask=a>1)) + if (any(a(1,:,1) /= (/ 2, 1, 2 /))) stop 183 + a(:,:,:) = reshape(data64(12:38), shape(a)) + call set(a(2,:,2), maxloc(a, mask=a>1)) + if (any(a(2,:,2) /= (/ 1, 1, 2 /))) stop 184 + a(:,:,:) = reshape(data64(13:39), shape(a)) + call set(a(3,:,3), maxloc(a, mask=a>1)) + if (any(a(3,:,3) /= (/ 3, 3, 1 /))) stop 185 + a(:,:,:) = reshape(data64(14:40), shape(a)) + call set(a(:,1,1), maxloc(a, mask=a>1)) + if (any(a(:,1,1) /= (/ 2, 3, 1 /))) stop 186 + a(:,:,:) = reshape(data64(15:41), shape(a)) + call set(a(:,3,3), maxloc(a, mask=a>1)) + if (any(a(:,3,3) /= (/ 1, 3, 1 /))) stop 187 + call set(a(1,:,:), maxloc(a, dim=1)) + end subroutine check_dependencies +end program p diff --git a/gcc/testsuite/gfortran.dg/minloc_8.f90 b/gcc/testsuite/gfortran.dg/minloc_8.f90 new file mode 100644 index 00000000000..ebbb7829afe --- /dev/null +++ b/gcc/testsuite/gfortran.dg/minloc_8.f90 @@ -0,0 +1,220 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check the correct behaviour of the inline minloc implementation, +! when there is no optional argument. + +program p + implicit none + integer, parameter :: data5(*) = (/ 8, 2, 7, 2, 9 /) + integer, parameter :: data64(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4, & + 5, 5, 8, 2, 6, 7, 8, 7, & + 4, 5, 3, 9, 0, 6, 4, 5, & + 5, 8, 2, 6, 7, 8, 7, 4, & + 5, 3, 9, 0, 6, 4, 5, 5, & + 8, 2, 6, 7, 8, 7, 4, 5, & + 3, 9, 0, 6, 4, 5, 5, 8, & + 2, 6, 7, 8, 7, 4, 5, 3 /) + call check_int_const_shape_rank_1 + call check_int_const_shape_rank_3 + call check_int_const_shape_empty_4 + call check_int_alloc_rank_1 + call check_int_alloc_rank_3 + call check_int_alloc_empty_4 + call check_real_const_shape_rank_1 + call check_real_const_shape_rank_3 + call check_real_const_shape_empty_4 + call check_real_alloc_rank_1 + call check_real_alloc_rank_3 + call check_real_alloc_empty_4 + call check_int_lower_bounds + call check_real_lower_bounds + call check_all_nans + call check_dependencies +contains + subroutine check_int_const_shape_rank_1() + integer :: a(5) + integer, allocatable :: m(:) + a = data5 + m = minloc(a) + if (size(m, dim=1) /= 1) stop 11 + if (any(m /= (/ 2 /))) stop 12 + end subroutine + subroutine check_int_const_shape_rank_3() + integer :: a(4,4,4) + integer, allocatable :: m(:) + a = reshape(data64, shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 21 + if (any(m /= (/ 2, 2, 1 /))) stop 22 + end subroutine + subroutine check_int_const_shape_empty_4() + integer :: a(9,3,0,7) + integer, allocatable :: m(:) + a = reshape((/ integer:: /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 4) stop 31 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 32 + end subroutine + subroutine check_int_alloc_rank_1() + integer, allocatable :: a(:) + integer, allocatable :: m(:) + allocate(a(5)) + a(:) = data5 + m = minloc(a) + if (size(m, dim=1) /= 1) stop 41 + if (any(m /= (/ 2 /))) stop 42 + end subroutine + subroutine check_int_alloc_rank_3() + integer, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 51 + if (any(m /= (/ 2, 2, 1 /))) stop 52 + end subroutine + subroutine check_int_alloc_empty_4() + integer, allocatable :: a(:,:,:,:) + integer, allocatable :: m(:) + allocate(a(9,3,0,7)) + a(:,:,:,:) = reshape((/ integer:: /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 4) stop 61 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 62 + end subroutine + subroutine check_real_const_shape_rank_1() + real :: a(5) + integer, allocatable :: m(:) + a = (/ real:: data5 /) + m = minloc(a) + if (size(m, dim=1) /= 1) stop 71 + if (any(m /= (/ 2 /))) stop 72 + end subroutine + subroutine check_real_const_shape_rank_3() + real :: a(4,4,4) + integer, allocatable :: m(:) + a = reshape((/ real:: data64 /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 81 + if (any(m /= (/ 2, 2, 1 /))) stop 82 + end subroutine + subroutine check_real_const_shape_empty_4() + real :: a(9,3,0,7) + integer, allocatable :: m(:) + a = reshape((/ real:: /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 4) stop 91 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 92 + end subroutine + subroutine check_real_alloc_rank_1() + real, allocatable :: a(:) + integer, allocatable :: m(:) + allocate(a(5)) + a(:) = (/ real:: data5 /) + m = minloc(a) + if (size(m, dim=1) /= 1) stop 111 + if (any(m /= (/ 2 /))) stop 112 + end subroutine + subroutine check_real_alloc_rank_3() + real, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 121 + if (any(m /= (/ 2, 2, 1 /))) stop 122 + end subroutine + subroutine check_real_alloc_empty_4() + real, allocatable :: a(:,:,:,:) + integer, allocatable :: m(:) + allocate(a(9,3,0,7)) + a(:,:,:,:) = reshape((/ real:: /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 4) stop 131 + if (any(m /= (/ 0, 0, 0, 0 /))) stop 132 + end subroutine + subroutine check_int_lower_bounds() + integer, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(3:6,-1:2,4)) + a(:,:,:) = reshape(data64, shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 141 + if (any(m /= (/ 2, 2, 1 /))) stop 142 + end subroutine + subroutine check_real_lower_bounds() + real, allocatable :: a(:,:,:) + integer, allocatable :: m(:) + allocate(a(3:6,-1:2,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 151 + if (any(m /= (/ 2, 2, 1 /))) stop 152 + end subroutine + subroutine check_all_nans() + real, allocatable :: a(:,:,:) + real :: nan + integer, allocatable :: m(:) + nan = 0 + nan = nan / nan + allocate(a(3,3,3), source = nan) + m = minloc(a) + if (size(m, dim=1) /= 3) stop 161 + if (any(m /= (/ 1, 1, 1 /))) stop 162 + end subroutine + elemental subroutine set(o, i) + integer, intent(out) :: o + integer, intent(in) :: i + o = i + end subroutine + subroutine check_dependencies() + integer, allocatable :: a(:,:,:) + allocate(a(3,3,3)) + ! Direct assignment + a(:,:,:) = reshape(data64(1:27), shape(a)) + a(1,1,:) = minloc(a) + if (any(a(1,1,:) /= (/ 3, 2, 1 /))) stop 171 + a(:,:,:) = reshape(data64(2:28), shape(a)) + a(3,3,:) = minloc(a) + if (any(a(3,3,:) /= (/ 2, 2, 1 /))) stop 172 + a(:,:,:) = reshape(data64(3:29), shape(a)) + a(1,:,1) = minloc(a) + if (any(a(1,:,1) /= (/ 1, 2, 1 /))) stop 173 + a(:,:,:) = reshape(data64(5:31), shape(a)) + a(2,:,2) = minloc(a) + if (any(a(2,:,2) /= (/ 2, 1, 1 /))) stop 174 + a(:,:,:) = reshape(data64(6:32), shape(a)) + a(3,:,3) = minloc(a) + if (any(a(3,:,3) /= (/ 1, 1, 1 /))) stop 175 + a(:,:,:) = reshape(data64(7:33), shape(a)) + a(:,1,1) = minloc(a) + if (any(a(:,1,1) /= (/ 3, 2, 2 /))) stop 176 + a(:,:,:) = reshape(data64(8:34), shape(a)) + a(:,3,3) = minloc(a) + if (any(a(:,3,3) /= (/ 2, 2, 2 /))) stop 177 + ! Subroutine assignment + a(:,:,:) = reshape(data64(9:35), shape(a)) + call set(a(1,1,:), minloc(a)) + if (any(a(1,1,:) /= (/ 1, 2, 2 /))) stop 181 + a(:,:,:) = reshape(data64(10:36), shape(a)) + call set(a(3,3,:), minloc(a)) + if (any(a(3,3,:) /= (/ 3, 1, 2 /))) stop 182 + a(:,:,:) = reshape(data64(11:37), shape(a)) + call set(a(1,:,1), minloc(a)) + if (any(a(1,:,1) /= (/ 2, 1, 2 /))) stop 183 + a(:,:,:) = reshape(data64(12:38), shape(a)) + call set(a(2,:,2), minloc(a)) + if (any(a(2,:,2) /= (/ 1, 1, 2 /))) stop 184 + a(:,:,:) = reshape(data64(13:39), shape(a)) + call set(a(3,:,3), minloc(a)) + if (any(a(3,:,3) /= (/ 3, 3, 1 /))) stop 185 + a(:,:,:) = reshape(data64(14:40), shape(a)) + call set(a(:,1,1), minloc(a)) + if (any(a(:,1,1) /= (/ 2, 3, 1 /))) stop 186 + a(:,:,:) = reshape(data64(15:41), shape(a)) + call set(a(:,3,3), minloc(a)) + if (any(a(:,3,3) /= (/ 1, 3, 1 /))) stop 187 + call set(a(1,:,:), minloc(a, dim=1)) + end subroutine check_dependencies +end program p diff --git a/gcc/testsuite/gfortran.dg/minloc_with_mask_1.f90 b/gcc/testsuite/gfortran.dg/minloc_with_mask_1.f90 new file mode 100644 index 00000000000..5b3498bb23d --- /dev/null +++ b/gcc/testsuite/gfortran.dg/minloc_with_mask_1.f90 @@ -0,0 +1,392 @@ +! { dg-do run } +! +! PR fortran/90608 +! Check the correct behaviour of the inline minloc implementation, +! when there is a mask argument. + +program p + implicit none + integer, parameter :: data10(*) = (/ 2, 5, 2, 3, 3, 5, 3, 6, 0, 1 /) + logical, parameter :: mask10(*) = (/ .false., .true., .false., & + .false., .true., .true., & + .true. , .true., .false., & + .false. /) + integer, parameter :: data64(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4, & + 5, 5, 8, 2, 6, 7, 8, 7, & + 4, 5, 3, 9, 0, 6, 4, 5, & + 5, 8, 2, 6, 7, 8, 7, 4, & + 5, 3, 9, 0, 6, 4, 5, 5, & + 8, 2, 6, 7, 8, 7, 4, 5, & + 3, 9, 0, 6, 4, 5, 5, 8, & + 2, 6, 7, 8, 7, 4, 5, 3 /) + logical, parameter :: mask64(*) = (/ .true. , .false., .false., .false., & + .true. , .false., .true. , .false., & + .false., .true. , .true. , .false., & + .true. , .true. , .true. , .true. , & + .false., .true. , .false., .true. , & + .false., .true. , .false., .true. , & + .true. , .false., .false., .true. , & + .true. , .true. , .true. , .false., & + .false., .false., .true. , .false., & + .true. , .false., .true. , .true. , & + .true. , .false., .true. , .true. , & + .false., .true. , .false., .true. , & + .false., .true. , .false., .false., & + .false., .true. , .true. , .true. , & + .false., .true. , .false., .true. , & + .true. , .false., .false., .false. /) + call check_int_const_shape_rank_1 + call check_int_const_shape_rank_3 + call check_int_const_shape_rank_3_true_mask + call check_int_const_shape_rank_3_false_mask + call check_int_const_shape_rank_3_optional_mask_present + call check_int_const_shape_rank_3_optional_mask_absent + call check_int_const_shape_empty_4 + call check_int_alloc_rank_1 + call check_int_alloc_rank_3 + call check_int_alloc_rank_3_true_mask + call check_int_alloc_rank_3_false_mask + call check_int_alloc_empty_4 + call check_real_const_shape_rank_1 + call check_real_const_shape_rank_3 + call check_real_const_shape_rank_3_true_mask + call check_real_const_shape_rank_3_false_mask + call check_real_const_shape_rank_3_optional_mask_present + call check_real_const_shape_rank_3_optional_mask_absent + call check_real_const_shape_empty_4 + call check_real_alloc_rank_1 + call check_real_alloc_rank_3 + call check_real_alloc_rank_3_true_mask + call check_real_alloc_rank_3_false_mask + call check_real_alloc_empty_4 + call check_lower_bounds + call check_all_nans + call check_dependencies +contains + subroutine check_int_const_shape_rank_1() + integer :: a(10) + logical :: m(10) + integer, allocatable :: r(:) + a = data10 + m = mask10 + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 11 + if (any(r /= (/ 5 /))) stop 12 + end subroutine + subroutine check_int_const_shape_rank_3() + integer :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + m = reshape(mask64, shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 21 + if (any(r /= (/ 2, 3, 1 /))) stop 22 + end subroutine + subroutine check_int_const_shape_rank_3_true_mask() + integer :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + r = minloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 31 + if (any(r /= (/ 2, 2, 1 /))) stop 32 + end subroutine + subroutine check_int_const_shape_rank_3_false_mask() + integer :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + r = minloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 41 + if (any(r /= (/ 0, 0, 0 /))) stop 42 + end subroutine + subroutine call_minloc_int(r, a, m) + integer :: a(:,:,:) + logical, optional :: m(:,:,:) + integer, allocatable :: r(:) + r = minloc(a, mask = m) + end subroutine + subroutine check_int_const_shape_rank_3_optional_mask_present() + integer :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + m = reshape(mask64, shape(m)) + call call_minloc_int(r, a, m) + if (size(r, dim = 1) /= 3) stop 51 + if (any(r /= (/ 2, 3, 1 /))) stop 52 + end subroutine + subroutine check_int_const_shape_rank_3_optional_mask_absent() + integer :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape(data64, shape(a)) + call call_minloc_int(r, a) + if (size(r, dim = 1) /= 3) stop 61 + if (any(r /= (/ 2, 2, 1 /))) stop 62 + end subroutine + subroutine check_int_const_shape_empty_4() + integer :: a(9,3,0,7) + logical :: m(9,3,0,7) + integer, allocatable :: r(:) + a = reshape((/ integer:: /), shape(a)) + m = reshape((/ logical:: /), shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 71 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 72 + end subroutine + subroutine check_int_alloc_rank_1() + integer, allocatable :: a(:) + logical, allocatable :: m(:) + integer, allocatable :: r(:) + allocate(a(10), m(10)) + a(:) = data10 + m(:) = mask10 + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 81 + if (any(r /= (/ 5 /))) stop 82 + end subroutine + subroutine check_int_alloc_rank_3() + integer, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4), m(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + m(:,:,:) = reshape(mask64, shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 91 + if (any(r /= (/ 2, 3, 1 /))) stop 92 + end subroutine + subroutine check_int_alloc_rank_3_true_mask() + integer, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + r = minloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 101 + if (any(r /= (/ 2, 2, 1 /))) stop 102 + end subroutine + subroutine check_int_alloc_rank_3_false_mask() + integer, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape(data64, shape(a)) + r = minloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 111 + if (any(r /= (/ 0, 0, 0 /))) stop 112 + end subroutine + subroutine check_int_alloc_empty_4() + integer, allocatable :: a(:,:,:,:) + logical, allocatable :: m(:,:,:,:) + integer, allocatable :: r(:) + allocate(a(9,3,0,7), m(9,3,0,7)) + a(:,:,:,:) = reshape((/ integer:: /), shape(a)) + m(:,:,:,:) = reshape((/ logical:: /), shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 121 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 122 + end subroutine + subroutine check_real_const_shape_rank_1() + real :: a(10) + logical :: m(10) + integer, allocatable :: r(:) + a = (/ real:: data10 /) + m = mask10 + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 131 + if (any(r /= (/ 5 /))) stop 132 + end subroutine + subroutine check_real_const_shape_rank_3() + real :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + m = reshape(mask64, shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 141 + if (any(r /= (/ 2, 3, 1 /))) stop 142 + end subroutine + subroutine check_real_const_shape_rank_3_true_mask() + real :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + r = minloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 151 + if (any(r /= (/ 2, 2, 1 /))) stop 152 + end subroutine + subroutine check_real_const_shape_rank_3_false_mask() + real :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + r = minloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 161 + if (any(r /= (/ 0, 0, 0 /))) stop 162 + end subroutine + subroutine call_minloc_real(r, a, m) + real :: a(:,:,:) + logical, optional :: m(:,:,:) + integer, allocatable :: r(:) + r = minloc(a, mask = m) + end subroutine + subroutine check_real_const_shape_rank_3_optional_mask_present() + real :: a(4,4,4) + logical :: m(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + m = reshape(mask64, shape(m)) + call call_minloc_real(r, a, m) + if (size(r, dim = 1) /= 3) stop 171 + if (any(r /= (/ 2, 3, 1 /))) stop 172 + end subroutine + subroutine check_real_const_shape_rank_3_optional_mask_absent() + real :: a(4,4,4) + integer, allocatable :: r(:) + a = reshape((/ real:: data64 /), shape(a)) + call call_minloc_real(r, a) + if (size(r, dim = 1) /= 3) stop 181 + if (any(r /= (/ 2, 2, 1 /))) stop 182 + end subroutine + subroutine check_real_const_shape_empty_4() + real :: a(9,3,0,7) + logical :: m(9,3,0,7) + integer, allocatable :: r(:) + a = reshape((/ real:: /), shape(a)) + m = reshape((/ logical:: /), shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 191 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 192 + end subroutine + subroutine check_real_alloc_rank_1() + real, allocatable :: a(:) + logical, allocatable :: m(:) + integer, allocatable :: r(:) + allocate(a(10), m(10)) + a(:) = (/ real:: data10 /) + m(:) = mask10 + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 1) stop 201 + if (any(r /= (/ 5 /))) stop 202 + end subroutine + subroutine check_real_alloc_rank_3() + real, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4), m(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m(:,:,:) = reshape(mask64, shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 211 + if (any(r /= (/ 2, 3, 1 /))) stop 212 + end subroutine + subroutine check_real_alloc_rank_3_true_mask() + real, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + r = minloc(a, mask = .true.) + if (size(r, dim = 1) /= 3) stop 221 + if (any(r /= (/ 2, 2, 1 /))) stop 222 + end subroutine + subroutine check_real_alloc_rank_3_false_mask() + real, allocatable :: a(:,:,:) + integer, allocatable :: r(:) + allocate(a(4,4,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + r = minloc(a, mask = .false.) + if (size(r, dim = 1) /= 3) stop 231 + if (any(r /= (/ 0, 0, 0 /))) stop 232 + end subroutine + subroutine check_real_alloc_empty_4() + real, allocatable :: a(:,:,:,:) + logical, allocatable :: m(:,:,:,:) + integer, allocatable :: r(:) + allocate(a(9,3,0,7), m(9,3,0,7)) + a(:,:,:,:) = reshape((/ real:: /), shape(a)) + m(:,:,:,:) = reshape((/ logical :: /), shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 4) stop 241 + if (any(r /= (/ 0, 0, 0, 0 /))) stop 242 + end subroutine + subroutine check_lower_bounds() + real, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + integer, allocatable :: r(:) + allocate(a(3:6,-1:2,4), m(3:6,-1:2,4)) + a(:,:,:) = reshape((/ real:: data64 /), shape(a)) + m = reshape(mask64, shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 251 + if (any(r /= (/ 2, 3, 1 /))) stop 252 + end subroutine + subroutine check_all_nans() + real, allocatable :: a(:,:,:) + logical, allocatable :: m(:,:,:) + real :: nan + integer, allocatable :: r(:) + nan = 0 + nan = nan / nan + allocate(a(3,3,3), source = nan) + allocate(m(3,3,3)) + m(:,:,:) = reshape((/ .false., .false., .true. , .true. , .false., & + .true. , .false., .false., .false., .true. , & + .true. , .false., .true. , .true. , .true. , & + .false., .false., .true. , .true. , .false., & + .false., .true. , .false., .false., .true. , & + .true. , .true. /), shape(m)) + r = minloc(a, mask = m) + if (size(r, dim = 1) /= 3) stop 261 + if (any(r /= (/ 3, 1, 1 /))) stop 262 + end subroutine + elemental subroutine set(o, i) + integer, intent(out) :: o + integer, intent(in) :: i + o = i + end subroutine + subroutine check_dependencies() + integer, allocatable :: a(:,:,:) + allocate(a(3,3,3)) + ! Direct assignment + a(:,:,:) = reshape(data64(1:27), shape(a)) + a(1,1,:) = minloc(a, mask=a<8) + if (any(a(1,1,:) /= (/ 3, 2, 1 /))) stop 171 + a(:,:,:) = reshape(data64(2:28), shape(a)) + a(3,3,:) = minloc(a, mask=a<8) + if (any(a(3,3,:) /= (/ 2, 2, 1 /))) stop 172 + a(:,:,:) = reshape(data64(3:29), shape(a)) + a(1,:,1) = minloc(a, mask=a<8) + if (any(a(1,:,1) /= (/ 1, 2, 1 /))) stop 173 + a(:,:,:) = reshape(data64(5:31), shape(a)) + a(2,:,2) = minloc(a, mask=a<8) + if (any(a(2,:,2) /= (/ 2, 1, 1 /))) stop 174 + a(:,:,:) = reshape(data64(6:32), shape(a)) + a(3,:,3) = minloc(a, mask=a<8) + if (any(a(3,:,3) /= (/ 1, 1, 1 /))) stop 175 + a(:,:,:) = reshape(data64(7:33), shape(a)) + a(:,1,1) = minloc(a, mask=a<8) + if (any(a(:,1,1) /= (/ 3, 2, 2 /))) stop 176 + a(:,:,:) = reshape(data64(8:34), shape(a)) + a(:,3,3) = minloc(a, mask=a<8) + if (any(a(:,3,3) /= (/ 2, 2, 2 /))) stop 177 + ! Subroutine assignment + a(:,:,:) = reshape(data64(9:35), shape(a)) + call set(a(1,1,:), minloc(a, mask=a<8)) + if (any(a(1,1,:) /= (/ 1, 2, 2 /))) stop 181 + a(:,:,:) = reshape(data64(10:36), shape(a)) + call set(a(3,3,:), minloc(a, mask=a<8)) + if (any(a(3,3,:) /= (/ 3, 1, 2 /))) stop 182 + a(:,:,:) = reshape(data64(11:37), shape(a)) + call set(a(1,:,1), minloc(a, mask=a<8)) + if (any(a(1,:,1) /= (/ 2, 1, 2 /))) stop 183 + a(:,:,:) = reshape(data64(12:38), shape(a)) + call set(a(2,:,2), minloc(a, mask=a<8)) + if (any(a(2,:,2) /= (/ 1, 1, 2 /))) stop 184 + a(:,:,:) = reshape(data64(13:39), shape(a)) + call set(a(3,:,3), minloc(a, mask=a<8)) + if (any(a(3,:,3) /= (/ 3, 3, 1 /))) stop 185 + a(:,:,:) = reshape(data64(14:40), shape(a)) + call set(a(:,1,1), minloc(a, mask=a<8)) + if (any(a(:,1,1) /= (/ 2, 3, 1 /))) stop 186 + a(:,:,:) = reshape(data64(15:41), shape(a)) + call set(a(:,3,3), minloc(a, mask=a<8)) + if (any(a(:,3,3) /= (/ 1, 3, 1 /))) stop 187 + call set(a(1,:,:), minloc(a, dim=1)) + end subroutine check_dependencies +end program p From patchwork Wed Jul 31 20:07:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967286 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=HE+hDoW/; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ37V6HZqz1ybb for ; Thu, 1 Aug 2024 06:08:22 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E2CA5385B529 for ; Wed, 31 Jul 2024 20:08:20 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from msa.smtpout.orange.fr (smtp-83.smtpout.orange.fr [80.12.242.83]) by sourceware.org (Postfix) with ESMTPS id 2C124385828B; Wed, 31 Jul 2024 20:07:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2C124385828B Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2C124385828B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.83 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456466; cv=none; b=efgHihKQeNMeNoSL4iEsWdJT4/NTFSkKi5+TXWWYUmQe2piS1w3jhDMAQUsZm7dvZfVBoh30+wLbLymLkkPdjRcmr54X7Qv+yQ5PPDm04sjM9JLMna54ZdUhLFikyFiw5WaEi7j8f14acnQjIu7xXgqPJ1Rt8F3skdukhx4mg+s= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456466; c=relaxed/simple; bh=ucnKPqdpjIBfZlqctZLoaPqDwB/oAqd2y9lTgdMHUZg=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=jqGo+AK7aEI0/1MAIh5leq/Cf0jf0MHVTJmKJlP7WoTPE1ZA+6GOl4nlbGmXZX6CW3Cn2sUH2UhsKIv+HFq/nw5ixGzDscus0gUZWFr/gwZ/GiLp4OROl/f5VkVp5FfbTS5F+6I7Dk7zYBZPKPg4rmGTELm8UUzL5l73IHKtHYM= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbrsqYRQ; Wed, 31 Jul 2024 22:07:43 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456463; bh=AAKXMOQNt5JxVGO/990fNewV0RM98WrhXm0myUYAwBs=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=HE+hDoW/8na++zaAXpR9X6ydkyzXLyWaesnIM+9Aogu1qJljNN6bRbpMQdcBiY235 8EffYRFlrXAMGmAniiLBo71WCRFaysxoKA+kUldYB/UaY6mfw6fK3q4MBjENwu2lQO Kf8KPZ+d0v4ZDEgcLVrztpZPllw/A9seJTStMMLfGUR8iiJk42gY+Rdv90Hxw/mKxc TpGgFGqi9kqUJvuvDi5HEzqsvQuepOP5iBg2W2yqP+IA4h9QruurEmIIYYm3oYLnGw BkEvE6f+SlozxCNHMCUn0vy2nuWJl5/I1L4rlQVUugFHzE11Kuf2qjLEMXo1UnYbTa U0ZPqjccpIXng== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:43 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 2/8] fortran: Disable frontend passes for inlinable MINLOC/MAXLOC [PR90608] Date: Wed, 31 Jul 2024 22:07:29 +0200 Message-ID: <20240731200735.229898-3-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Disable rewriting of MINLOC/MAXLOC expressions for which inline code generation is supported. Update the gfc_inline_intrinsic_function_p predicate (already existing) for that, with the current state of MINLOC/MAXLOC inlining support, that is only the cases of a scalar result and non-CHARACTER argument for now. This change has no effect currently, as the MINLOC/MAXLOC front-end passes only change expressions of rank 1, but the inlining control predicate gfc_inline_intrinsic_function_p returns false for those. However, later changes will extend MINLOC/MAXLOC inline expansion support to array expressions and update the inlining control predicate, and this will become effective. PR fortran/90608 gcc/fortran/ChangeLog: * frontend-passes.cc (optimize_minmaxloc): Skip if we can generate inline code for the unmodified expression. * trans-intrinsic.cc (gfc_inline_intrinsic_function_p): Add MINLOC and MAXLOC cases. --- gcc/fortran/frontend-passes.cc | 3 ++- gcc/fortran/trans-intrinsic.cc | 23 +++++++++++++++++++++++ 2 files changed, 25 insertions(+), 1 deletion(-) diff --git a/gcc/fortran/frontend-passes.cc b/gcc/fortran/frontend-passes.cc index 3c06018fdbb..8e4c6310ba8 100644 --- a/gcc/fortran/frontend-passes.cc +++ b/gcc/fortran/frontend-passes.cc @@ -2277,7 +2277,8 @@ optimize_minmaxloc (gfc_expr **e) || fn->value.function.actual == NULL || fn->value.function.actual->expr == NULL || fn->value.function.actual->expr->ts.type == BT_CHARACTER - || fn->value.function.actual->expr->rank != 1) + || fn->value.function.actual->expr->rank != 1 + || gfc_inline_intrinsic_function_p (fn)) return; *e = gfc_get_array_expr (fn->ts.type, fn->ts.kind, &fn->where); diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 9f3c3ce47bc..cc0d00f4e39 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -11650,6 +11650,29 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) case GFC_ISYM_TRANSPOSE: return true; + case GFC_ISYM_MINLOC: + case GFC_ISYM_MAXLOC: + { + /* Disable inline expansion if code size matters. */ + if (optimize_size) + return false; + + gfc_actual_arglist *array_arg = expr->value.function.actual; + gfc_actual_arglist *dim_arg = array_arg->next; + + gfc_expr *array = array_arg->expr; + gfc_expr *dim = dim_arg->expr; + + if (!(array->ts.type == BT_INTEGER + || array->ts.type == BT_REAL)) + return false; + + if (array->rank == 1 && dim != nullptr) + return true; + + return false; + } + default: return false; } From patchwork Wed Jul 31 20:07:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967290 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=OmzYEIW8; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ38d0Jgvz1ybb for ; Thu, 1 Aug 2024 06:09:21 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3D38A385B503 for ; Wed, 31 Jul 2024 20:09:19 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp.smtpout.orange.fr (smtp-30.smtpout.orange.fr [80.12.242.30]) by sourceware.org (Postfix) with ESMTPS id 28CA93858289; Wed, 31 Jul 2024 20:07:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 28CA93858289 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 28CA93858289 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.30 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456467; cv=none; b=kamQWSzT44dqv838SgL6IR41XIpkkZIzUALqeKJXepGpb4cq35TH8l0d2nhiS96oQ051xvSztYyQ64deeh5qpCGyrc1eF9DYVaqiDqZKdC1XuhMnnoIDlsgkGGGsy0ykQ/zmFcEQjVPRerr6aZXfzkprAxGrpO6mCmiZge54C5o= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456467; c=relaxed/simple; bh=aJJC+LZha53NiYplU3HDCr26ngJxJ8wuVYmmgMeOn6E=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=M74RJqTnJmJMUVzY4Qd10fRlUC4KjwWRCj/p30lDBupEmjINwlyqs/Xt8lFG+RCeGViiOkUvyH7V/DjBwlh+Wz0k5YArIJwwY84qMMGBZxYcxmaR8UQtbOzUC90/J+sFJ3WAESkTq+41o41XtxfVS65f+vGRnBLvnoudSVWn0wM= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbrsqYRR; Wed, 31 Jul 2024 22:07:43 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456463; bh=NVVowaN3yy8XjRY0cDamWllcwocAKk3I4KzdX0d+gGQ=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=OmzYEIW86FIKvoFijElTnpEsBU7mCW2mOZiMHPSK4qUiawrzAHCkqh4g2OUTngr2b IwUUOD7WNEkPvuMkvwL0G1HrseGUJWJytmuSgBqXLPEbev50aGWVuGUejXXoSoyR0m xOhJ56mdBmPUSn5I47/CV6GKQrP44oATKiWdqE7yYdyMsvepEFljDYhUZAQrdXuY8l 1nMMfsenYo/Gzp0BPFsLiBfBY9tos8Q42H/F6xG7tghXL5GPNdkzOHTLLir4B9q1Yb MHSOddezVOuim3YOiCXdXjBD5vqU3wtCXQR87yvzdJrkIUVdDcGpHiCanfmgViu31X c0te5F0BpgHHQ== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:43 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 3/8] fortran: Inline MINLOC/MAXLOC with no DIM and ARRAY of rank 1 [PR90608] Date: Wed, 31 Jul 2024 22:07:30 +0200 Message-ID: <20240731200735.229898-4-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable inline code generation for the MINLOC and MAXLOC intrinsic, if the DIM argument is not present and ARRAY has rank 1. This case is similar to the case where the result is scalar (DIM present and rank 1 ARRAY), which already supports inline expansion of the intrinsic. Both cases return the same value, with the difference that the result is an array of size 1 if DIM is absent, whereas it's a scalar if DIM is present. So all there is to do for the new case to work is hook the inline expansion with the scalarizer. PR fortran/90608 gcc/fortran/ChangeLog: * trans-array.cc (gfc_conv_ss_startstride): Set the scalarization rank based on the MINLOC/MAXLOC rank if needed. Call the inline code generation and setup the scalarizer array descriptor info in the MINLOC and MAXLOC cases. * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Return the result array element if the scalarizer is setup and we are inside the loops. Restrict library function call dispatch to the case where inline expansion is not supported. Declare an array result if the expression isn't scalar. Initialize the array result single element and return the result variable if the expression isn't scalar. (walk_inline_intrinsic_minmaxloc): New function. (walk_inline_intrinsic_function): Add MINLOC and MAXLOC cases, dispatching to walk_inline_intrinsic_minmaxloc. (gfc_add_intrinsic_ss_code): Add MINLOC and MAXLOC cases. (gfc_inline_intrinsic_function_p): Return true if ARRAY has rank 1, regardless of DIM. --- gcc/fortran/trans-array.cc | 25 +++++ gcc/fortran/trans-intrinsic.cc | 198 ++++++++++++++++++++++----------- 2 files changed, 155 insertions(+), 68 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index c93a5f1e754..0c78e1fecd8 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -4771,6 +4771,8 @@ gfc_conv_ss_startstride (gfc_loopinfo * loop) case GFC_ISYM_UBOUND: case GFC_ISYM_LCOBOUND: case GFC_ISYM_UCOBOUND: + case GFC_ISYM_MAXLOC: + case GFC_ISYM_MINLOC: case GFC_ISYM_SHAPE: case GFC_ISYM_THIS_IMAGE: loop->dimen = ss->dimen; @@ -4820,6 +4822,29 @@ done: case GFC_SS_INTRINSIC: switch (expr->value.function.isym->id) { + case GFC_ISYM_MINLOC: + case GFC_ISYM_MAXLOC: + { + gfc_se se; + gfc_init_se (&se, nullptr); + se.loop = loop; + se.ss = ss; + gfc_conv_intrinsic_function (&se, expr); + gfc_add_block_to_block (&outer_loop->pre, &se.pre); + gfc_add_block_to_block (&outer_loop->post, &se.post); + + info->descriptor = se.expr; + + info->data = gfc_conv_array_data (info->descriptor); + info->data = gfc_evaluate_now (info->data, &outer_loop->pre); + + info->offset = gfc_index_zero_node; + info->start[0] = gfc_index_zero_node; + info->end[0] = gfc_index_zero_node; + info->stride[0] = gfc_index_one_node; + continue; + } + /* Fall through to supply start and stride. */ case GFC_ISYM_LBOUND: case GFC_ISYM_UBOUND: diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index cc0d00f4e39..a947dd1ba0b 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5273,66 +5273,69 @@ strip_kind_from_actual (gfc_actual_arglist * actual) we need to handle. For performance reasons we sometimes create two loops instead of one, where the second one is much simpler. Examples for minloc intrinsic: - 1) Result is an array, a call is generated - 2) Array mask is used and NaNs need to be supported: - limit = Infinity; - pos = 0; - S = from; - while (S <= to) { - if (mask[S]) { - if (pos == 0) pos = S + (1 - from); - if (a[S] <= limit) { limit = a[S]; pos = S + (1 - from); goto lab1; } - } - S++; - } - goto lab2; - lab1:; - while (S <= to) { - if (mask[S]) if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } - S++; - } - lab2:; - 3) NaNs need to be supported, but it is known at compile time or cheaply - at runtime whether array is nonempty or not: - limit = Infinity; - pos = 0; - S = from; - while (S <= to) { - if (a[S] <= limit) { limit = a[S]; pos = S + (1 - from); goto lab1; } - S++; - } - if (from <= to) pos = 1; - goto lab2; - lab1:; - while (S <= to) { - if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } - S++; - } - lab2:; - 4) NaNs aren't supported, array mask is used: - limit = infinities_supported ? Infinity : huge (limit); - pos = 0; - S = from; - while (S <= to) { - if (mask[S]) { limit = a[S]; pos = S + (1 - from); goto lab1; } - S++; - } - goto lab2; - lab1:; - while (S <= to) { - if (mask[S]) if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } - S++; - } - lab2:; - 5) Same without array mask: - limit = infinities_supported ? Infinity : huge (limit); - pos = (from <= to) ? 1 : 0; - S = from; - while (S <= to) { - if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } - S++; - } - For 3) and 5), if mask is scalar, this all goes into a conditional, + A: Result is scalar. + 1) Array mask is used and NaNs need to be supported: + limit = Infinity; + pos = 0; + S = from; + while (S <= to) { + if (mask[S]) { + if (pos == 0) pos = S + (1 - from); + if (a[S] <= limit) { limit = a[S]; pos = S + (1 - from); goto lab1; } + } + S++; + } + goto lab2; + lab1:; + while (S <= to) { + if (mask[S]) if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } + S++; + } + lab2:; + 2) NaNs need to be supported, but it is known at compile time or cheaply + at runtime whether array is nonempty or not: + limit = Infinity; + pos = 0; + S = from; + while (S <= to) { + if (a[S] <= limit) { limit = a[S]; pos = S + (1 - from); goto lab1; } + S++; + } + if (from <= to) pos = 1; + goto lab2; + lab1:; + while (S <= to) { + if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } + S++; + } + lab2:; + 3) NaNs aren't supported, array mask is used: + limit = infinities_supported ? Infinity : huge (limit); + pos = 0; + S = from; + while (S <= to) { + if (mask[S]) { limit = a[S]; pos = S + (1 - from); goto lab1; } + S++; + } + goto lab2; + lab1:; + while (S <= to) { + if (mask[S]) if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } + S++; + } + lab2:; + 4) Same without array mask: + limit = infinities_supported ? Infinity : huge (limit); + pos = (from <= to) ? 1 : 0; + S = from; + while (S <= to) { + if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } + S++; + } + B) ARRAY has rank 1, and DIM is absent. Use the same code as the scalar + case and wrap the result in an array. + C) Otherwise, a call is generated + For 2) and 4), if mask is scalar, this all goes into a conditional, setting pos = 0; in the else branch. Since we now also support the BACK argument, instead of using @@ -5346,7 +5349,7 @@ strip_kind_from_actual (gfc_actual_arglist * actual) .... The optimizer is smart enough to move the condition out of the loop. - The are now marked as unlikely to for further speedup. */ + They are now marked as unlikely too for further speedup. */ static void gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) @@ -5377,6 +5380,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_expr *backexpr; gfc_se backse; tree pos; + tree result_var = NULL_TREE; int n; bool optional_mask; @@ -5392,8 +5396,19 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (se->ss) { - gfc_conv_intrinsic_funcall (se, expr); - return; + if (se->ss->info->useflags) + { + /* The inline implementation of MINLOC/MAXLOC has been generated + before, out of the scalarization loop; now we can just use the + result. */ + gfc_conv_tmp_array_ref (se); + return; + } + else if (!gfc_inline_intrinsic_function_p (expr)) + { + gfc_conv_intrinsic_funcall (se, expr); + return; + } } arrayexpr = actual->expr; @@ -5419,10 +5434,29 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) return; } + type = gfc_typenode_for_spec (&expr->ts); + + if (expr->rank > 0) + { + gfc_array_spec as; + memset (&as, 0, sizeof (as)); + + as.rank = 1; + as.lower[0] = gfc_get_int_expr (gfc_index_integer_kind, + &arrayexpr->where, + HOST_WIDE_INT_1); + as.upper[0] = gfc_get_int_expr (gfc_index_integer_kind, + &arrayexpr->where, + HOST_WIDE_INT_1); + + tree array = gfc_get_nodesc_array_type (type, &as, PACKED_STATIC, true); + + result_var = gfc_create_var (array, "loc_result"); + } + /* Initialize the result. */ pos = gfc_create_var (gfc_array_index_type, "pos"); offset = gfc_create_var (gfc_array_index_type, "offset"); - type = gfc_typenode_for_spec (&expr->ts); /* Walk the arguments. */ arrayss = gfc_walk_expr (arrayexpr); @@ -5826,7 +5860,18 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } gfc_cleanup_loop (&loop); - se->expr = convert (type, pos); + tree value = convert (type, pos); + if (expr->rank > 0) + { + tree res_arr_ref = gfc_build_array_ref (result_var, gfc_index_zero_node, + NULL_TREE, true); + + gfc_add_modify (&se->pre, res_arr_ref, value); + + se->expr = result_var; + } + else + se->expr = value; } /* Emit code for findloc. */ @@ -11535,6 +11580,19 @@ walk_inline_intrinsic_arith (gfc_ss *ss, gfc_expr *expr) } +/* Create the gfc_ss list for the arguments to MINLOC or MAXLOC when the + function is to be inlined. */ + +static gfc_ss * +walk_inline_intrinsic_minmaxloc (gfc_ss *ss, gfc_expr *expr ATTRIBUTE_UNUSED) +{ + if (expr->rank == 0) + return ss; + + return gfc_get_array_ss (ss, expr, 1, GFC_SS_INTRINSIC); +} + + static gfc_ss * walk_inline_intrinsic_function (gfc_ss * ss, gfc_expr * expr) { @@ -11548,6 +11606,10 @@ walk_inline_intrinsic_function (gfc_ss * ss, gfc_expr * expr) case GFC_ISYM_TRANSPOSE: return walk_inline_intrinsic_transpose (ss, expr); + case GFC_ISYM_MAXLOC: + case GFC_ISYM_MINLOC: + return walk_inline_intrinsic_minmaxloc (ss, expr); + default: gcc_unreachable (); } @@ -11567,6 +11629,8 @@ gfc_add_intrinsic_ss_code (gfc_loopinfo * loop ATTRIBUTE_UNUSED, gfc_ss * ss) case GFC_ISYM_LBOUND: case GFC_ISYM_UCOBOUND: case GFC_ISYM_LCOBOUND: + case GFC_ISYM_MAXLOC: + case GFC_ISYM_MINLOC: case GFC_ISYM_THIS_IMAGE: case GFC_ISYM_SHAPE: break; @@ -11658,16 +11722,14 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) return false; gfc_actual_arglist *array_arg = expr->value.function.actual; - gfc_actual_arglist *dim_arg = array_arg->next; gfc_expr *array = array_arg->expr; - gfc_expr *dim = dim_arg->expr; if (!(array->ts.type == BT_INTEGER || array->ts.type == BT_REAL)) return false; - if (array->rank == 1 && dim != nullptr) + if (array->rank == 1) return true; return false; From patchwork Wed Jul 31 20:07:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967292 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=fVRWisZL; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ3BP57j4z1ybX for ; Thu, 1 Aug 2024 06:10:53 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DD1303858294 for ; Wed, 31 Jul 2024 20:10:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from msa.smtpout.orange.fr (out-71.smtpout.orange.fr [193.252.22.71]) by sourceware.org (Postfix) with ESMTPS id 62904385842C; Wed, 31 Jul 2024 20:07:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 62904385842C Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 62904385842C Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=193.252.22.71 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456471; cv=none; b=PS0oQ1QE/kGPb1shLEKUOynTYAFjB0r9RJ99Rsmu3s/GvUYFxSejZrJwQIY0Db6ISzIvpPyLvqYNpAwnd76AxbyKX77qPmDv1kuTUvrlUxuMXhcYaL8ThVgRRQePlyE0E6SRCpHD3miA3UNVcgQs+KOxMywhi+BQz6SR+hh/VSI= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456471; c=relaxed/simple; bh=D+A8d61NHwFTtJMul1nowMeTaDpuXv75CYwTFpSY7uw=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=ER7/zgjfGf3NDMILWQ97qJ/p7snV2TrifZEuR/cheMga4hLGCC3j24N+OPrIdfHpJJRp9vjj1HWaJ7tKkHUtuSSlPquQ2uBFBM7GNnOHyhBn9ZR6dU6GzeJjpeo+9Kode/O9YFqLH/iCzOce6hJCdxw+aG9/bbOkv/3U3HmdTNA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbrsqYRU; Wed, 31 Jul 2024 22:07:43 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456463; bh=gmqke6nce7eV90Bs35OwNwnafwoIXyfhEO+qZrOqHq8=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=fVRWisZLOfXWEQmJdxPdxWVmBTwRc0qK2OfR/bEcXo4diAg8vIu6Pa+t5MtQEWSu0 Cyeo04YCArbNrQRanUS6PGk+qttiCX3WMLGrEQd3lndihHwHTfcbCoUFOazAo722QY JyheZrekjJbieR5ztmqnC1jsukmb/uk3eMP9ibGqEVZIfovuVgWr5IPsMxMWvXhbZa fy96GGAK1/urfkokxwkltBcA0Z9H3fnXpZ9h7WtCFtf3RQZ/nDnxSwveUz89t14o4D XFCY9BjHQcXaEWkApvf2Gv0aG9b5mGdvQ+aZXMhU8ZQTChjU75Q8ArbvUVe/UVgac3 f0rt2AYg4boTg== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:43 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 4/8] fortran: Outline array bound check generation code Date: Wed, 31 Jul 2024 22:07:31 +0200 Message-ID: <20240731200735.229898-5-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin The next patch will need reindenting of the array bound check generation code. This outlines it to its own function beforehand, reducing the churn in the next patch. Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- gcc/fortran/ChangeLog: * trans-array.cc (gfc_conv_ss_startstride): Move array bound check generation code... (add_check_section_in_array_bounds): ... here as a new function. --- gcc/fortran/trans-array.cc | 297 ++++++++++++++++++------------------- 1 file changed, 143 insertions(+), 154 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 0c78e1fecd8..99a603a3afb 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -4736,6 +4736,146 @@ gfc_conv_section_startstride (stmtblock_t * block, gfc_ss * ss, int dim) } +/* Generate in INNER the bounds checking code along the dimension DIM for + the array associated with SS_INFO. */ + +static void +add_check_section_in_array_bounds (stmtblock_t *inner, gfc_ss_info *ss_info, + int dim) +{ + gfc_expr *expr = ss_info->expr; + locus *expr_loc = &expr->where; + const char *expr_name = expr->symtree->name; + + gfc_array_info *info = &ss_info->data.array; + + bool check_upper; + if (dim == info->ref->u.ar.dimen - 1 + && info->ref->u.ar.as->type == AS_ASSUMED_SIZE) + check_upper = false; + else + check_upper = true; + + /* Zero stride is not allowed. */ + tree tmp = fold_build2_loc (input_location, EQ_EXPR, logical_type_node, + info->stride[dim], gfc_index_zero_node); + char * msg = xasprintf ("Zero stride is not allowed, for dimension %d " + "of array '%s'", dim + 1, expr_name); + gfc_trans_runtime_check (true, false, tmp, inner, expr_loc, msg); + free (msg); + + tree desc = info->descriptor; + + /* This is the run-time equivalent of resolve.cc's + check_dimension. The logical is more readable there + than it is here, with all the trees. */ + tree lbound = gfc_conv_array_lbound (desc, dim); + tree end = info->end[dim]; + tree ubound = check_upper ? gfc_conv_array_ubound (desc, dim) : NULL_TREE; + + /* non_zerosized is true when the selected range is not + empty. */ + tree stride_pos = fold_build2_loc (input_location, GT_EXPR, logical_type_node, + info->stride[dim], gfc_index_zero_node); + tmp = fold_build2_loc (input_location, LE_EXPR, logical_type_node, + info->start[dim], end); + stride_pos = fold_build2_loc (input_location, TRUTH_AND_EXPR, + logical_type_node, stride_pos, tmp); + + tree stride_neg = fold_build2_loc (input_location, LT_EXPR, logical_type_node, + info->stride[dim], gfc_index_zero_node); + tmp = fold_build2_loc (input_location, GE_EXPR, logical_type_node, + info->start[dim], end); + stride_neg = fold_build2_loc (input_location, TRUTH_AND_EXPR, + logical_type_node, stride_neg, tmp); + tree non_zerosized = fold_build2_loc (input_location, TRUTH_OR_EXPR, + logical_type_node, stride_pos, + stride_neg); + + /* Check the start of the range against the lower and upper + bounds of the array, if the range is not empty. + If upper bound is present, include both bounds in the + error message. */ + if (check_upper) + { + tmp = fold_build2_loc (input_location, LT_EXPR, logical_type_node, + info->start[dim], lbound); + tmp = fold_build2_loc (input_location, TRUTH_AND_EXPR, logical_type_node, + non_zerosized, tmp); + tree tmp2 = fold_build2_loc (input_location, GT_EXPR, logical_type_node, + info->start[dim], ubound); + tmp2 = fold_build2_loc (input_location, TRUTH_AND_EXPR, logical_type_node, + non_zerosized, tmp2); + msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' outside of " + "expected range (%%ld:%%ld)", dim + 1, expr_name); + gfc_trans_runtime_check (true, false, tmp, inner, expr_loc, msg, + fold_convert (long_integer_type_node, info->start[dim]), + fold_convert (long_integer_type_node, lbound), + fold_convert (long_integer_type_node, ubound)); + gfc_trans_runtime_check (true, false, tmp2, inner, expr_loc, msg, + fold_convert (long_integer_type_node, info->start[dim]), + fold_convert (long_integer_type_node, lbound), + fold_convert (long_integer_type_node, ubound)); + free (msg); + } + else + { + tmp = fold_build2_loc (input_location, LT_EXPR, logical_type_node, + info->start[dim], lbound); + tmp = fold_build2_loc (input_location, TRUTH_AND_EXPR, logical_type_node, + non_zerosized, tmp); + msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' below " + "lower bound of %%ld", dim + 1, expr_name); + gfc_trans_runtime_check (true, false, tmp, inner, expr_loc, msg, + fold_convert (long_integer_type_node, info->start[dim]), + fold_convert (long_integer_type_node, lbound)); + free (msg); + } + + /* Compute the last element of the range, which is not + necessarily "end" (think 0:5:3, which doesn't contain 5) + and check it against both lower and upper bounds. */ + + tmp = fold_build2_loc (input_location, MINUS_EXPR, gfc_array_index_type, + end, info->start[dim]); + tmp = fold_build2_loc (input_location, TRUNC_MOD_EXPR, gfc_array_index_type, + tmp, info->stride[dim]); + tmp = fold_build2_loc (input_location, MINUS_EXPR, gfc_array_index_type, + end, tmp); + tree tmp2 = fold_build2_loc (input_location, LT_EXPR, logical_type_node, + tmp, lbound); + tmp2 = fold_build2_loc (input_location, TRUTH_AND_EXPR, logical_type_node, + non_zerosized, tmp2); + if (check_upper) + { + tree tmp3 = fold_build2_loc (input_location, GT_EXPR, logical_type_node, + tmp, ubound); + tmp3 = fold_build2_loc (input_location, TRUTH_AND_EXPR, logical_type_node, + non_zerosized, tmp3); + msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' outside of " + "expected range (%%ld:%%ld)", dim + 1, expr_name); + gfc_trans_runtime_check (true, false, tmp2, inner, expr_loc, msg, + fold_convert (long_integer_type_node, tmp), + fold_convert (long_integer_type_node, ubound), + fold_convert (long_integer_type_node, lbound)); + gfc_trans_runtime_check (true, false, tmp3, inner, expr_loc, msg, + fold_convert (long_integer_type_node, tmp), + fold_convert (long_integer_type_node, ubound), + fold_convert (long_integer_type_node, lbound)); + free (msg); + } + else + { + msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' below " + "lower bound of %%ld", dim + 1, expr_name); + gfc_trans_runtime_check (true, false, tmp2, inner, expr_loc, msg, + fold_convert (long_integer_type_node, tmp), + fold_convert (long_integer_type_node, lbound)); + free (msg); + } +} + + /* Calculates the range start and stride for a SS chain. Also gets the descriptor and data pointer. The range of vector subscripts is the size of the vector. Array bounds are also checked. */ @@ -4746,7 +4886,6 @@ gfc_conv_ss_startstride (gfc_loopinfo * loop) int n; tree tmp; gfc_ss *ss; - tree desc; gfc_loopinfo * const outer_loop = outermost_loop (loop); @@ -4916,10 +5055,8 @@ done: if (gfc_option.rtcheck & GFC_RTCHECK_BOUNDS) { stmtblock_t block; - tree lbound, ubound; - tree end; tree size[GFC_MAX_DIMENSIONS]; - tree stride_pos, stride_neg, non_zerosized, tmp2, tmp3; + tree tmp3; gfc_array_info *info; char *msg; int dim; @@ -4985,163 +5122,15 @@ done: dimensions are checked later. */ for (n = 0; n < loop->dimen; n++) { - bool check_upper; - dim = ss->dim[n]; if (info->ref->u.ar.dimen_type[dim] != DIMEN_RANGE) continue; - if (dim == info->ref->u.ar.dimen - 1 - && info->ref->u.ar.as->type == AS_ASSUMED_SIZE) - check_upper = false; - else - check_upper = true; - - /* Zero stride is not allowed. */ - tmp = fold_build2_loc (input_location, EQ_EXPR, logical_type_node, - info->stride[dim], gfc_index_zero_node); - msg = xasprintf ("Zero stride is not allowed, for dimension %d " - "of array '%s'", dim + 1, expr_name); - gfc_trans_runtime_check (true, false, tmp, &inner, - expr_loc, msg); - free (msg); - - desc = info->descriptor; - - /* This is the run-time equivalent of resolve.cc's - check_dimension(). The logical is more readable there - than it is here, with all the trees. */ - lbound = gfc_conv_array_lbound (desc, dim); - end = info->end[dim]; - if (check_upper) - ubound = gfc_conv_array_ubound (desc, dim); - else - ubound = NULL; - - /* non_zerosized is true when the selected range is not - empty. */ - stride_pos = fold_build2_loc (input_location, GT_EXPR, - logical_type_node, info->stride[dim], - gfc_index_zero_node); - tmp = fold_build2_loc (input_location, LE_EXPR, logical_type_node, - info->start[dim], end); - stride_pos = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, stride_pos, tmp); - - stride_neg = fold_build2_loc (input_location, LT_EXPR, - logical_type_node, - info->stride[dim], gfc_index_zero_node); - tmp = fold_build2_loc (input_location, GE_EXPR, logical_type_node, - info->start[dim], end); - stride_neg = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, - stride_neg, tmp); - non_zerosized = fold_build2_loc (input_location, TRUTH_OR_EXPR, - logical_type_node, - stride_pos, stride_neg); - - /* Check the start of the range against the lower and upper - bounds of the array, if the range is not empty. - If upper bound is present, include both bounds in the - error message. */ - if (check_upper) - { - tmp = fold_build2_loc (input_location, LT_EXPR, - logical_type_node, - info->start[dim], lbound); - tmp = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, - non_zerosized, tmp); - tmp2 = fold_build2_loc (input_location, GT_EXPR, - logical_type_node, - info->start[dim], ubound); - tmp2 = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, - non_zerosized, tmp2); - msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' " - "outside of expected range (%%ld:%%ld)", - dim + 1, expr_name); - gfc_trans_runtime_check (true, false, tmp, &inner, - expr_loc, msg, - fold_convert (long_integer_type_node, info->start[dim]), - fold_convert (long_integer_type_node, lbound), - fold_convert (long_integer_type_node, ubound)); - gfc_trans_runtime_check (true, false, tmp2, &inner, - expr_loc, msg, - fold_convert (long_integer_type_node, info->start[dim]), - fold_convert (long_integer_type_node, lbound), - fold_convert (long_integer_type_node, ubound)); - free (msg); - } - else - { - tmp = fold_build2_loc (input_location, LT_EXPR, - logical_type_node, - info->start[dim], lbound); - tmp = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, non_zerosized, tmp); - msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' " - "below lower bound of %%ld", - dim + 1, expr_name); - gfc_trans_runtime_check (true, false, tmp, &inner, - expr_loc, msg, - fold_convert (long_integer_type_node, info->start[dim]), - fold_convert (long_integer_type_node, lbound)); - free (msg); - } - - /* Compute the last element of the range, which is not - necessarily "end" (think 0:5:3, which doesn't contain 5) - and check it against both lower and upper bounds. */ - - tmp = fold_build2_loc (input_location, MINUS_EXPR, - gfc_array_index_type, end, - info->start[dim]); - tmp = fold_build2_loc (input_location, TRUNC_MOD_EXPR, - gfc_array_index_type, tmp, - info->stride[dim]); - tmp = fold_build2_loc (input_location, MINUS_EXPR, - gfc_array_index_type, end, tmp); - tmp2 = fold_build2_loc (input_location, LT_EXPR, - logical_type_node, tmp, lbound); - tmp2 = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, non_zerosized, tmp2); - if (check_upper) - { - tmp3 = fold_build2_loc (input_location, GT_EXPR, - logical_type_node, tmp, ubound); - tmp3 = fold_build2_loc (input_location, TRUTH_AND_EXPR, - logical_type_node, non_zerosized, tmp3); - msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' " - "outside of expected range (%%ld:%%ld)", - dim + 1, expr_name); - gfc_trans_runtime_check (true, false, tmp2, &inner, - expr_loc, msg, - fold_convert (long_integer_type_node, tmp), - fold_convert (long_integer_type_node, ubound), - fold_convert (long_integer_type_node, lbound)); - gfc_trans_runtime_check (true, false, tmp3, &inner, - expr_loc, msg, - fold_convert (long_integer_type_node, tmp), - fold_convert (long_integer_type_node, ubound), - fold_convert (long_integer_type_node, lbound)); - free (msg); - } - else - { - msg = xasprintf ("Index '%%ld' of dimension %d of array '%s' " - "below lower bound of %%ld", - dim + 1, expr_name); - gfc_trans_runtime_check (true, false, tmp2, &inner, - expr_loc, msg, - fold_convert (long_integer_type_node, tmp), - fold_convert (long_integer_type_node, lbound)); - free (msg); - } + add_check_section_in_array_bounds (&inner, ss_info, dim); /* Check the section sizes match. */ tmp = fold_build2_loc (input_location, MINUS_EXPR, - gfc_array_index_type, end, + gfc_array_index_type, info->end[dim], info->start[dim]); tmp = fold_build2_loc (input_location, FLOOR_DIV_EXPR, gfc_array_index_type, tmp, From patchwork Wed Jul 31 20:07:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967289 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=fbDVQeYe; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ38R0Rs3z1ybb for ; Thu, 1 Aug 2024 06:09:11 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 45B64385332C for ; Wed, 31 Jul 2024 20:09:09 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp.smtpout.orange.fr (smtp-29.smtpout.orange.fr [80.12.242.29]) by sourceware.org (Postfix) with ESMTPS id B06363858414; Wed, 31 Jul 2024 20:07:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B06363858414 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B06363858414 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.29 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456468; cv=none; b=bgFKAXNKJ0tjqee+z4k1fZzkxzrGxjUYkY/DHwaSK93iZgjPRN1pexYJXbgyeysPNN/qclaWA7CSJjCvChKVRExigd+h+o6vKNZmPHsEBbi+Ixc2sk6KALzdMKey6cOqIWGvUVWTaynUhoZSBD6IniULfguLL8p7yJ76Hle+h/c= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456468; c=relaxed/simple; bh=6sCuTyx37kYH2npPle2GWhu3Wi9+x/Qm+e7EOg6MFiQ=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=cp1JfkSbwuSreVsCt5v5s+RDFkTcYZlZANhebxYeZNqCiOft0KNIgp2ozcHOk1hClFzsN77ZTkpa9GT4LMEMa6M/fZYzorT6Ysqkp37GBGkpUy3YRDGbFOLBu6K+gjoPAqoqUL+8lm1W89gfNHcvlEHklPRqr9j026+xJhMMNyo= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbrsqYRW; Wed, 31 Jul 2024 22:07:43 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456463; bh=4zU9SPH98Ku5UA8ZYsN4MGCjpcVWoCVe4Q9PTmIEy04=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=fbDVQeYeTOaoptE1TwYTYE+k+6bRpsDKQU19/7NK+9j56jA2hi98lgJO2ULKoZkDp +BAweafN5yw07HDcLNOlsmlfjydARcCK0GpC8REJDn55xki/nht5ReIsEVubsUThlR 6JEzo/jZm0FSqta2UyjkCToGbmsBYumnLOIRKMRIaukMmJM2m68V3H8xlsu+eHmGwY sQK+ScvOA1XgQGBN8xSDnD2nOtye/4y5GR9iFIAC8Xl3xETTqcrbs0z7M8FdiTyg/W /YyKF77xD0H9gS7oeyZuvgGT37eNeG2A2whVpHxzmni7MDTQz4DT4iCh8pthCmK25t DU6kq0wayw1pg== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:43 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 5/8] fortran: Inline integral MINLOC/MAXLOC with no DIM and no MASK [PR90608] Date: Wed, 31 Jul 2024 22:07:32 +0200 Message-ID: <20240731200735.229898-6-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_STOCKGEN, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable generation of inline code for the MINLOC and MAXLOC intrinsic, if the ARRAY argument is of integral type and of any rank (only the rank 1 case was previously inlined), and neither DIM nor MASK arguments are present. This needs a few adjustments in gfc_conv_intrinsic_minmaxloc, mainly to replace the single variables POS and OFFSET, with collections of variables, one variable per dimension each. The restriction to integral ARRAY and absent MASK limits the scope of the change to the cases where we generate single loop inline code. The code generation for the second loop is only accessible with ARRAY of rank 1, so it can continue using a single variable. A later change will extend inlining to the double loop cases. There is some bounds checking code that was previously handled by the library, and that needed some changes in the scalarizer to avoid regressing. The bounds check code generation was already supported by the scalarizer, but it was only applying to array reference sections, checking both for array bound violation and for shape conformability between all the involved arrays. With this change, for MINLOC or MAXLOC, enable the conformability check between all the scalarized arrays, and disable the array bound violation check. PR fortran/90608 gcc/fortran/ChangeLog: * trans-array.cc (gfc_conv_ss_startstride): Set the MINLOC/MAXLOC result upper bound using the rank of the ARRAY argument. Ajdust the error message for intrinsic result arrays. Only check array bounds for array references. Move bound check decision code... (bounds_check_needed): ... here as a new predicate. Allow bound check for MINLOC/MAXLOC intrinsic results. * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Change the result array upper bound to the rank of ARRAY. Update the NONEMPTY variable to depend on the non-empty extent of every dimension. Use one variable per dimension instead of a single variable for the position and the offset. Update their declaration, initialization, and update to affect the variable of each dimension. Use the first variable only in areas only accessed with rank 1 ARRAY argument. Set every element of the result using its corresponding variable. (gfc_inline_intrinsic_function_p): Return true for integral ARRAY and absent DIM and MASK. gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_bounds_4.f90: Additionally accept the error message emitted by the scalarizer. --- gcc/fortran/trans-array.cc | 70 ++++++-- gcc/fortran/trans-intrinsic.cc | 150 +++++++++++++----- gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 | 4 +- 3 files changed, 166 insertions(+), 58 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 99a603a3afb..76448c8ac0e 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -4876,6 +4876,35 @@ add_check_section_in_array_bounds (stmtblock_t *inner, gfc_ss_info *ss_info, } +/* Tells whether we need to generate bounds checking code for the array + associated with SS. */ + +bool +bounds_check_needed (gfc_ss *ss) +{ + /* Catch allocatable lhs in f2003. */ + if (flag_realloc_lhs && ss->no_bounds_check) + return false; + + gfc_ss_info *ss_info = ss->info; + if (ss_info->type == GFC_SS_SECTION) + return true; + + if (!(ss_info->type == GFC_SS_INTRINSIC + && ss_info->expr + && ss_info->expr->expr_type == EXPR_FUNCTION)) + return false; + + gfc_intrinsic_sym *isym = ss_info->expr->value.function.isym; + if (!(isym + && (isym->id == GFC_ISYM_MAXLOC + || isym->id == GFC_ISYM_MINLOC))) + return false; + + return gfc_inline_intrinsic_function_p (ss_info->expr); +} + + /* Calculates the range start and stride for a SS chain. Also gets the descriptor and data pointer. The range of vector subscripts is the size of the vector. Array bounds are also checked. */ @@ -4977,10 +5006,17 @@ done: info->data = gfc_conv_array_data (info->descriptor); info->data = gfc_evaluate_now (info->data, &outer_loop->pre); - info->offset = gfc_index_zero_node; + gfc_expr *array = expr->value.function.actual->expr; + tree rank = build_int_cst (gfc_array_index_type, array->rank); + + tree tmp = fold_build2_loc (input_location, MINUS_EXPR, + gfc_array_index_type, rank, + gfc_index_one_node); + + info->end[0] = gfc_evaluate_now (tmp, &outer_loop->pre); info->start[0] = gfc_index_zero_node; - info->end[0] = gfc_index_zero_node; info->stride[0] = gfc_index_one_node; + info->offset = gfc_index_zero_node; continue; } @@ -5098,14 +5134,10 @@ done: const char *expr_name; char *ref_name = NULL; + if (!bounds_check_needed (ss)) + continue; + ss_info = ss->info; - if (ss_info->type != GFC_SS_SECTION) - continue; - - /* Catch allocatable lhs in f2003. */ - if (flag_realloc_lhs && ss->no_bounds_check) - continue; - expr = ss_info->expr; expr_loc = &expr->where; if (expr->ref) @@ -5123,10 +5155,13 @@ done: for (n = 0; n < loop->dimen; n++) { dim = ss->dim[n]; - if (info->ref->u.ar.dimen_type[dim] != DIMEN_RANGE) - continue; + if (ss_info->type == GFC_SS_SECTION) + { + if (info->ref->u.ar.dimen_type[dim] != DIMEN_RANGE) + continue; - add_check_section_in_array_bounds (&inner, ss_info, dim); + add_check_section_in_array_bounds (&inner, ss_info, dim); + } /* Check the section sizes match. */ tmp = fold_build2_loc (input_location, MINUS_EXPR, @@ -5147,9 +5182,14 @@ done: { tmp3 = fold_build2_loc (input_location, NE_EXPR, logical_type_node, tmp, size[n]); - msg = xasprintf ("Array bound mismatch for dimension %d " - "of array '%s' (%%ld/%%ld)", - dim + 1, expr_name); + if (ss_info->type == GFC_SS_INTRINSIC) + msg = xasprintf ("Extent mismatch for dimension %d of the " + "result of intrinsic '%s' (%%ld/%%ld)", + dim + 1, expr_name); + else + msg = xasprintf ("Array bound mismatch for dimension %d " + "of array '%s' (%%ld/%%ld)", + dim + 1, expr_name); gfc_trans_runtime_check (true, false, tmp3, &inner, expr_loc, msg, diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index a947dd1ba0b..ac8bd2d4812 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5332,9 +5332,30 @@ strip_kind_from_actual (gfc_actual_arglist * actual) if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } S++; } - B) ARRAY has rank 1, and DIM is absent. Use the same code as the scalar + B: ARRAY has rank 1, and DIM is absent. Use the same code as the scalar case and wrap the result in an array. - C) Otherwise, a call is generated + C: ARRAY has rank > 1, NANs are not supported, and DIM and MASK are absent. + Generate code similar to the single loop scalar case, but using one + variable per dimension, for example if ARRAY has rank 2: + 4) NAN's aren't supported, no MASK: + limit = infinities_supported ? Infinity : huge (limit); + pos0 = (from0 <= to0 && from1 <= to1) ? 1 : 0; + pos1 = (from0 <= to0 && from1 <= to1) ? 1 : 0; + S1 = from1; + while (S1 <= to1) { + S0 = from0; + while (S0 <= to0) { + if (a[S1][S0] < limit) { + limit = a[S1][S0]; + pos0 = S + (1 - from0); + pos1 = S + (1 - from1); + } + S0++; + } + S1++; + } + result = { pos0, pos1 }; + D: Otherwise, a call is generated. For 2) and 4), if mask is scalar, this all goes into a conditional, setting pos = 0; in the else branch. @@ -5348,8 +5369,8 @@ strip_kind_from_actual (gfc_actual_arglist * actual) if (cond) { .... - The optimizer is smart enough to move the condition out of the loop. - They are now marked as unlikely too for further speedup. */ + The optimizer is smart enough to move the condition out of the loop. + They are now marked as unlikely too for further speedup. */ static void gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) @@ -5364,7 +5385,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) tree cond; tree elsetmp; tree ifbody; - tree offset; + tree offset[GFC_MAX_DIMENSIONS]; tree nonempty; tree lab1, lab2; tree b_if, b_else; @@ -5379,7 +5400,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_expr *maskexpr; gfc_expr *backexpr; gfc_se backse; - tree pos; + tree pos[GFC_MAX_DIMENSIONS]; tree result_var = NULL_TREE; int n; bool optional_mask; @@ -5447,7 +5468,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) HOST_WIDE_INT_1); as.upper[0] = gfc_get_int_expr (gfc_index_integer_kind, &arrayexpr->where, - HOST_WIDE_INT_1); + arrayexpr->rank); tree array = gfc_get_nodesc_array_type (type, &as, PACKED_STATIC, true); @@ -5455,8 +5476,13 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } /* Initialize the result. */ - pos = gfc_create_var (gfc_array_index_type, "pos"); - offset = gfc_create_var (gfc_array_index_type, "offset"); + for (int i = 0; i < arrayexpr->rank; i++) + { + pos[i] = gfc_create_var (gfc_array_index_type, + gfc_get_string ("pos%d", i)); + offset[i] = gfc_create_var (gfc_array_index_type, + gfc_get_string ("offset%d", i)); + } /* Walk the arguments. */ arrayss = gfc_walk_expr (arrayexpr); @@ -5573,10 +5599,26 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) loop.temp_dim = loop.dimen; gfc_conv_loop_setup (&loop, &expr->where); - gcc_assert (loop.dimen == 1); - if (nonempty == NULL && maskss == NULL && loop.from[0] && loop.to[0]) - nonempty = fold_build2_loc (input_location, LE_EXPR, logical_type_node, - loop.from[0], loop.to[0]); + if (nonempty == NULL && maskss == NULL) + { + nonempty = logical_true_node; + + for (int i = 0; i < loop.dimen; i++) + { + if (!(loop.from[i] && loop.to[i])) + { + nonempty = NULL; + break; + } + + tree tmp = fold_build2_loc (input_location, LE_EXPR, + logical_type_node, loop.from[i], + loop.to[i]); + + nonempty = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR, + logical_type_node, nonempty, tmp); + } + } lab1 = NULL; lab2 = NULL; @@ -5586,14 +5628,18 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) is non-empty and no MASK is used, we can initialize to 1 to simplify the inner loop. */ if (nonempty != NULL && !HONOR_NANS (DECL_MODE (limit))) - gfc_add_modify (&loop.pre, pos, - fold_build3_loc (input_location, COND_EXPR, - gfc_array_index_type, - nonempty, gfc_index_one_node, - gfc_index_zero_node)); + { + tree init = fold_build3_loc (input_location, COND_EXPR, + gfc_array_index_type, nonempty, + gfc_index_one_node, + gfc_index_zero_node); + for (int i = 0; i < loop.dimen; i++) + gfc_add_modify (&loop.pre, pos[i], init); + } else { - gfc_add_modify (&loop.pre, pos, gfc_index_zero_node); + gcc_assert (loop.dimen == 1); + gfc_add_modify (&loop.pre, pos[0], gfc_index_zero_node); lab1 = gfc_build_label_decl (NULL_TREE); TREE_USED (lab1) = 1; lab2 = gfc_build_label_decl (NULL_TREE); @@ -5602,11 +5648,14 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* An offset must be added to the loop counter to obtain the required position. */ - gcc_assert (loop.from[0]); + for (int i = 0; i < loop.dimen; i++) + { + gcc_assert (loop.from[i]); - tmp = fold_build2_loc (input_location, MINUS_EXPR, gfc_array_index_type, - gfc_index_one_node, loop.from[0]); - gfc_add_modify (&loop.pre, offset, tmp); + tmp = fold_build2_loc (input_location, MINUS_EXPR, gfc_array_index_type, + gfc_index_one_node, loop.from[i]); + gfc_add_modify (&loop.pre, offset[i], tmp); + } gfc_mark_ss_chain_used (arrayss, lab1 ? 3 : 1); if (maskss) @@ -5647,20 +5696,23 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) tree ifbody2; gfc_start_block (&ifblock2); - tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos), - loop.loopvar[0], offset); - gfc_add_modify (&ifblock2, pos, tmp); + tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[0]), + loop.loopvar[0], offset[0]); + gfc_add_modify (&ifblock2, pos[0], tmp); ifbody2 = gfc_finish_block (&ifblock2); - cond = fold_build2_loc (input_location, EQ_EXPR, logical_type_node, pos, - gfc_index_zero_node); + cond = fold_build2_loc (input_location, EQ_EXPR, logical_type_node, + pos[0], gfc_index_zero_node); tmp = build3_v (COND_EXPR, cond, ifbody2, build_empty_stmt (input_location)); gfc_add_expr_to_block (&block, tmp); } - tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos), - loop.loopvar[0], offset); - gfc_add_modify (&ifblock, pos, tmp); + for (int i = 0; i < loop.dimen; i++) + { + tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[i]), + loop.loopvar[i], offset[i]); + gfc_add_modify (&ifblock, pos[i], tmp); + } if (lab1) gfc_add_expr_to_block (&ifblock, build1_v (GOTO_EXPR, lab1)); @@ -5724,13 +5776,15 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (lab1) { + gcc_assert (loop.dimen == 1); + gfc_trans_scalarized_loop_boundary (&loop, &body); if (HONOR_NANS (DECL_MODE (limit))) { if (nonempty != NULL) { - ifbody = build2_v (MODIFY_EXPR, pos, gfc_index_one_node); + ifbody = build2_v (MODIFY_EXPR, pos[0], gfc_index_one_node); tmp = build3_v (COND_EXPR, nonempty, ifbody, build_empty_stmt (input_location)); gfc_add_expr_to_block (&loop.code[0], tmp); @@ -5767,9 +5821,9 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* Assign the value to the limit... */ gfc_add_modify (&ifblock, limit, arrayse.expr); - tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos), - loop.loopvar[0], offset); - gfc_add_modify (&ifblock, pos, tmp); + tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[0]), + loop.loopvar[0], offset[0]); + gfc_add_modify (&ifblock, pos[0], tmp); ifbody = gfc_finish_block (&ifblock); @@ -5832,6 +5886,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* For a scalar mask, enclose the loop in an if statement. */ if (maskexpr && maskss == NULL) { + gcc_assert (loop.dimen == 1); tree ifmask; gfc_init_se (&maskse, NULL); @@ -5846,7 +5901,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) the pos variable the same way as above. */ gfc_init_block (&elseblock); - gfc_add_modify (&elseblock, pos, gfc_index_zero_node); + gfc_add_modify (&elseblock, pos[0], gfc_index_zero_node); elsetmp = gfc_finish_block (&elseblock); ifmask = conv_mask_condition (&maskse, maskexpr, optional_mask); tmp = build3_v (COND_EXPR, ifmask, tmp, elsetmp); @@ -5860,18 +5915,22 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } gfc_cleanup_loop (&loop); - tree value = convert (type, pos); if (expr->rank > 0) { - tree res_arr_ref = gfc_build_array_ref (result_var, gfc_index_zero_node, - NULL_TREE, true); + for (int i = 0; i < arrayexpr->rank; i++) + { + tree res_idx = build_int_cst (gfc_array_index_type, i); + tree res_arr_ref = gfc_build_array_ref (result_var, res_idx, + NULL_TREE, true); - gfc_add_modify (&se->pre, res_arr_ref, value); + tree value = convert (type, pos[i]); + gfc_add_modify (&se->pre, res_arr_ref, value); + } se->expr = result_var; } else - se->expr = value; + se->expr = convert (type, pos[0]); } /* Emit code for findloc. */ @@ -11722,8 +11781,12 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) return false; gfc_actual_arglist *array_arg = expr->value.function.actual; + gfc_actual_arglist *dim_arg = array_arg->next; + gfc_actual_arglist *mask_arg = dim_arg->next; gfc_expr *array = array_arg->expr; + gfc_expr *dim = dim_arg->expr; + gfc_expr *mask = mask_arg->expr; if (!(array->ts.type == BT_INTEGER || array->ts.type == BT_REAL)) @@ -11732,6 +11795,11 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) if (array->rank == 1) return true; + if (array->ts.type == BT_INTEGER + && dim == nullptr + && mask == nullptr) + return true; + return false; } diff --git a/gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 b/gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 index b1c7ca752d0..17f6cd86dc2 100644 --- a/gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 +++ b/gcc/testsuite/gfortran.dg/maxloc_bounds_4.f90 @@ -1,6 +1,6 @@ ! { dg-do run } ! { dg-options "-fbounds-check" } -! { dg-shouldfail "Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2" } +! { dg-shouldfail "Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2|Array bound mismatch for dimension 1 of array 'res' .3/2." } module tst contains subroutine foo(res) @@ -18,4 +18,4 @@ program main integer :: res(3) call foo(res) end program main -! { dg-output "Fortran runtime error: Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2" } +! { dg-output "Fortran runtime error: Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2|Array bound mismatch for dimension 1 of array 'res' .3/2." } From patchwork Wed Jul 31 20:07:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967294 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=AVFxX75S; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ3Bc4QG6z1ybX for ; Thu, 1 Aug 2024 06:11:04 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D6A75385C6C7 for ; Wed, 31 Jul 2024 20:11:02 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from msa.smtpout.orange.fr (smtp-81.smtpout.orange.fr [80.12.242.81]) by sourceware.org (Postfix) with ESMTPS id D0BC53858416; Wed, 31 Jul 2024 20:07:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D0BC53858416 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D0BC53858416 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.81 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456468; cv=none; b=FrvP/DzxjUj1tn/byA3LUn8/5ozrybCO2acxrwoXptuPraVXEqFAsim5mU7z4QX2PJKUN2it6gG3B96YMgMlxrEyr1bhvt95g0vNmWRFicwTVPTBiOTF8CuIiPDntZn0anm7pofK1gYrIhAyTe54k4GOBOORzlb4tnJcsgZI3LM= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456468; c=relaxed/simple; bh=md7Y3Z1dP6zJ+SgQHeEk4YzQnxbZ3jBjk5RJ2rDR84w=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=VXbljoULKcewfOg1TaItWHzSCKQjfr0unb/OQTtniFUgXzrQPiax9UdlwHS5ByuxbQDLnHe/5f9ZS2o6iWZtcGfBM+MQZi6lBq8zcdI7bysA4YelpAGIcjn+zo+YIXyOuTe/fMV6WrjXrx5vfVCO9qZdYxSQCM1mm+sj9eNwMK4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbrsqYRX; Wed, 31 Jul 2024 22:07:43 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456463; bh=JGpWMiS+AYIaivEzzifSB9uohQv3HHWwYu5QCO7ju9o=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=AVFxX75S+jSvSosG1cE7lb19bDjrmNH/Xd71HrAr9C/KDt8hZhsP5dqq6EmHuIw/w R5ffm2jz/K/JmPOjaC+aRiHLJtAPwLeTy8RaRlfEAn/RGKI24wVyUGVC1T7/252456 uqKzuLq7xfKgHznq8oJvzbRiOgWkguvK9yeUEvXZSk8HKZ8TfWInsoxML+6YLl/GHv VrgXNiL4AW/1Shw1TH1TtwDKxw8AGExtziyasdwX1++lgsuI1eAhY/kbIGoA09r5G/ YDHsE2uSxtpku/oqN0v4yzL/IyLhHAcTe+K5xlnmpCJ3W4pxbM2SZruWeOl8Es7xB4 aLx+UWKl5yKkw== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:43 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 6/8] fortran: Inline integral MINLOC/MAXLOC with no DIM and scalar MASK [PR90608] Date: Wed, 31 Jul 2024 22:07:33 +0200 Message-ID: <20240731200735.229898-7-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable the generation of inline code for MINLOC/MAXLOC when argument ARRAY is of integral type, DIM is not present, and MASK is present and is scalar (only absent MASK or rank 1 ARRAY were inlined before). Scalar masks are implemented with a wrapping condition around the code one would generate if MASK wasn't present, so they are easy to support once inline code without MASK is working. PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Generate variable initialization for each dimension in the else branch of the toplevel condition. (gfc_inline_intrinsic_function_p): Return TRUE for scalar MASK. gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_bounds_7.f90: Additionally accept the error message reported by the scalarizer. --- gcc/fortran/trans-intrinsic.cc | 13 ++++++++----- gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 | 4 ++-- 2 files changed, 10 insertions(+), 7 deletions(-) diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index ac8bd2d4812..85520871797 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5886,7 +5886,6 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* For a scalar mask, enclose the loop in an if statement. */ if (maskexpr && maskss == NULL) { - gcc_assert (loop.dimen == 1); tree ifmask; gfc_init_se (&maskse, NULL); @@ -5901,7 +5900,8 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) the pos variable the same way as above. */ gfc_init_block (&elseblock); - gfc_add_modify (&elseblock, pos[0], gfc_index_zero_node); + for (int i = 0; i < loop.dimen; i++) + gfc_add_modify (&elseblock, pos[i], gfc_index_zero_node); elsetmp = gfc_finish_block (&elseblock); ifmask = conv_mask_condition (&maskse, maskexpr, optional_mask); tmp = build3_v (COND_EXPR, ifmask, tmp, elsetmp); @@ -11795,9 +11795,12 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) if (array->rank == 1) return true; - if (array->ts.type == BT_INTEGER - && dim == nullptr - && mask == nullptr) + if (array->ts.type != BT_INTEGER + || dim != nullptr) + return false; + + if (mask == nullptr + || mask->rank == 0) return true; return false; diff --git a/gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 b/gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 index 206a29b149d..3aa9d3dcebe 100644 --- a/gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 +++ b/gcc/testsuite/gfortran.dg/maxloc_bounds_7.f90 @@ -1,6 +1,6 @@ ! { dg-do run } ! { dg-options "-fbounds-check" } -! { dg-shouldfail "Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2" } +! { dg-shouldfail "Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2|Array bound mismatch for dimension 1 of array 'res' .3/2." } module tst contains subroutine foo(res) @@ -18,4 +18,4 @@ program main integer :: res(3) call foo(res) end program main -! { dg-output "Fortran runtime error: Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2" } +! { dg-output "Fortran runtime error: Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2|Array bound mismatch for dimension 1 of array 'res' .3/2." } From patchwork Wed Jul 31 20:07:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967288 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=FgE087Xz; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ380302dz1ybb for ; Thu, 1 Aug 2024 06:08:48 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1EBC3385DDC6 for ; Wed, 31 Jul 2024 20:08:46 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp.smtpout.orange.fr (smtp-30.smtpout.orange.fr [80.12.242.30]) by sourceware.org (Postfix) with ESMTPS id D4F993858424; Wed, 31 Jul 2024 20:07:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D4F993858424 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D4F993858424 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.30 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456468; cv=none; b=lD7vtmSltnIbIUrtzPZlJKQjbq4op/6iPLwVNlOgp4f/RWsyxdKn200rH5bkHZqzIvc+jT2fcJctsq/7SE7St9tBE3nzCN0h9bhBmgzLNO7EFrdoSONwtGYTa7JBbouuCe3XWP0AQDW/G+xNourBKI33sjItF6mnAxYViiUoz8Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456468; c=relaxed/simple; bh=uyB3betHrsQtV6hpwW9WeUDvDsU9GBsV7NY8oS3eTgw=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=xnEGzUMdz2E8eIleTJTNysa/K+E3VzVeEdrVUmaRSuLzgOcxDzH5mPYzq0bIE5+79S3BtVfbQNuKg5QDMaeVZRmOtInappjKhcMxBtCpAQBIn5UhMiagiXXf/4AbyqX3V4VdW1sXS1WYaeU+HfCdWJzfEiRDj25wQntVWYj45HA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbrsqYRZ; Wed, 31 Jul 2024 22:07:44 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456464; bh=jerLWomcCsoFK+5cVm/QFodbL9r4jacLZYn2TE/2uP4=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=FgE087Xzi7/XDfxBzx+HBDceWDiXvCwuz0lCtkgLX9e/QiTw/uRU0863x9YKBkOBp GTeLUsnXNjBc5g3qRBeaj5wRrzqS3TXlfzxLHyUk0V3wx0Rrtl32n1Tko0wgMmnscH DnXdd60NfaTeEgZT9Pak4njLQrSSwedb3TSg5v/KsLXREe4rOHSzK7A6lKQknVTZkj aL1U2uNkB6LgXXqfAUF39oJ0PuyBctw8izsaBq4mC6MM9Rj2L2aevFJ8+mvRqWtZ9y 12UO1SCl5JcvFOOxOZ6yAjdGF4RBQFUul98V/wFS80a5Wjw2Cqg0dpYGyB2T9l94/Q /r5VslgxFpWJQ== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:44 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 7/8] fortran: Inline non-character MINLOC/MAXLOC with no DIM [PR90608] Date: Wed, 31 Jul 2024 22:07:34 +0200 Message-ID: <20240731200735.229898-8-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Enable generation of inline MINLOC/MAXLOC code in the case where DIM is not present, and either ARRAY is of floating point type or MASK is an array. Those cases are the remaining bits to fully support inlining of non-CHARACTER MINLOC/MAXLOC without DIM. They are treated together because they generate similar code, the NANs for REAL types being handled a bit like a second level of masking. These are the cases for which we generate two sets of loops. This change affects the code generating the second loop, that was previously accessible only in the cases ARRAY has rank 1 only. The single variable initialization and update are changed to apply to multiple variables, one per dimension. The code generated is as follows (if ARRAY has rank 2): for (idx11 in lower1..upper1) { for (idx12 in lower2..upper2) { ... if (...) { ... goto second_loop; } } } second_loop: for (idx21 in lower1..upper1) { for (idx22 in lower2..upper2) { ... } } This code leads to processing the first elements redundantly, both in the first set of loops and in the second one. The loop over idx22 could start from idx12 the first time it is run, but as it has to start from lower2 for the rest of the runs, this change uses the same bounds for both set of loops for simplicity. In the rank 1 case, this makes the generated code worse compared to the inline code that was generated before. A later change will introduce conditionals to avoid the duplicate processing and restore the generated code in that case. PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Initialize and update all the variables. Put the label and goto in the outermost scalarizer loop. Don't start the second loop where the first stopped. (gfc_inline_intrinsic_function_p): Also return TRUE for array MASK or for any REAL type. gcc/testsuite/ChangeLog: * gfortran.dg/maxloc_bounds_5.f90: Additionally accept error messages reported by the scalarizer. * gfortran.dg/maxloc_bounds_6.f90: Ditto. --- gcc/fortran/trans-intrinsic.cc | 127 ++++++++++++------ gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 | 4 +- gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 | 4 +- 3 files changed, 87 insertions(+), 48 deletions(-) diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 85520871797..3a6a73d4241 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5332,12 +5332,55 @@ strip_kind_from_actual (gfc_actual_arglist * actual) if (a[S] < limit) { limit = a[S]; pos = S + (1 - from); } S++; } - B: ARRAY has rank 1, and DIM is absent. Use the same code as the scalar - case and wrap the result in an array. - C: ARRAY has rank > 1, NANs are not supported, and DIM and MASK are absent. - Generate code similar to the single loop scalar case, but using one - variable per dimension, for example if ARRAY has rank 2: - 4) NAN's aren't supported, no MASK: + B: Array result, non-CHARACTER type, DIM absent + Generate similar code as in the scalar case, using a collection of + variables (one per dimension) instead of a single variable as result. + Picking only cases 1) and 4) with ARRAY of rank 2, the generated code + becomes: + 1) Array mask is used and NaNs need to be supported: + limit = Infinity; + pos0 = 0; + pos1 = 0; + S1 = from1; + while (S1 <= to1) { + S0 = from0; + while (s0 <= to0 { + if (mask[S1][S0]) { + if (pos0 == 0) { + pos0 = S0 + (1 - from0); + pos1 = S1 + (1 - from1); + } + if (a[S1][S0] <= limit) { + limit = a[S1][S0]; + pos0 = S0 + (1 - from0); + pos1 = S1 + (1 - from1); + goto lab1; + } + } + S0++; + } + S1++; + } + goto lab2; + lab1:; + S1 = from1; + while (S1 <= to1) { + S0 = from0; + while (S0 <= to0) { + if (mask[S1][S0]) + if (a[S1][S0] < limit) { + limit = a[S1][S0]; + pos0 = S + (1 - from0); + pos1 = S + (1 - from1); + } + S0++; + } + S1++; + } + lab2:; + result = { pos0, pos1 }; + ... + 4) NANs aren't supported, no array mask. limit = infinities_supported ? Infinity : huge (limit); pos0 = (from0 <= to0 && from1 <= to1) ? 1 : 0; pos1 = (from0 <= to0 && from1 <= to1) ? 1 : 0; @@ -5355,7 +5398,7 @@ strip_kind_from_actual (gfc_actual_arglist * actual) S1++; } result = { pos0, pos1 }; - D: Otherwise, a call is generated. + C: Otherwise, a call is generated. For 2) and 4), if mask is scalar, this all goes into a conditional, setting pos = 0; in the else branch. @@ -5584,18 +5627,11 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* The code generated can have more than one loop in sequence (see the comment at the function header). This doesn't work well with the scalarizer, which changes arrays' offset when the scalarization loops - are generated (see gfc_trans_preloop_setup). Fortunately, {min,max}loc - are currently inlined in the scalar case only (for which loop is of rank - one). As there is no dependency to care about in that case, there is no - temporary, so that we can use the scalarizer temporary code to handle - multiple loops. Thus, we set temp_dim here, we call gfc_mark_ss_chain_used - with flag=3 later, and we use gfc_trans_scalarized_loop_boundary even later - to restore offset. - TODO: this prevents inlining of rank > 0 minmaxloc calls, so this - should eventually go away. We could either create two loops properly, - or find another way to save/restore the array offsets between the two - loops (without conflicting with temporary management), or use a single - loop minmaxloc implementation. See PR 31067. */ + are generated (see gfc_trans_preloop_setup). Fortunately, we can use + the scalarizer temporary code to handle multiple loops. Thus, we set + temp_dim here, we call gfc_mark_ss_chain_used with flag=3 later, and + we use gfc_trans_scalarized_loop_boundary even later to restore + offset. */ loop.temp_dim = loop.dimen; gfc_conv_loop_setup (&loop, &expr->where); @@ -5638,8 +5674,8 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } else { - gcc_assert (loop.dimen == 1); - gfc_add_modify (&loop.pre, pos[0], gfc_index_zero_node); + for (int i = 0; i < loop.dimen; i++) + gfc_add_modify (&loop.pre, pos[i], gfc_index_zero_node); lab1 = gfc_build_label_decl (NULL_TREE); TREE_USED (lab1) = 1; lab2 = gfc_build_label_decl (NULL_TREE); @@ -5696,10 +5732,14 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) tree ifbody2; gfc_start_block (&ifblock2); - tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[0]), - loop.loopvar[0], offset[0]); - gfc_add_modify (&ifblock2, pos[0], tmp); + for (int i = 0; i < loop.dimen; i++) + { + tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[i]), + loop.loopvar[i], offset[i]); + gfc_add_modify (&ifblock2, pos[i], tmp); + } ifbody2 = gfc_finish_block (&ifblock2); + cond = fold_build2_loc (input_location, EQ_EXPR, logical_type_node, pos[0], gfc_index_zero_node); tmp = build3_v (COND_EXPR, cond, ifbody2, @@ -5776,23 +5816,29 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (lab1) { - gcc_assert (loop.dimen == 1); - gfc_trans_scalarized_loop_boundary (&loop, &body); + stmtblock_t * const outer_block = &loop.code[loop.dimen - 1]; + if (HONOR_NANS (DECL_MODE (limit))) { if (nonempty != NULL) { - ifbody = build2_v (MODIFY_EXPR, pos[0], gfc_index_one_node); + stmtblock_t init_block; + gfc_init_block (&init_block); + + for (int i = 0; i < loop.dimen; i++) + gfc_add_modify (&init_block, pos[i], gfc_index_one_node); + + tree ifbody = gfc_finish_block (&init_block); tmp = build3_v (COND_EXPR, nonempty, ifbody, build_empty_stmt (input_location)); - gfc_add_expr_to_block (&loop.code[0], tmp); + gfc_add_expr_to_block (outer_block, tmp); } } - gfc_add_expr_to_block (&loop.code[0], build1_v (GOTO_EXPR, lab2)); - gfc_add_expr_to_block (&loop.code[0], build1_v (LABEL_EXPR, lab1)); + gfc_add_expr_to_block (outer_block, build1_v (GOTO_EXPR, lab2)); + gfc_add_expr_to_block (outer_block, build1_v (LABEL_EXPR, lab1)); /* If we have a mask, only check this element if the mask is set. */ if (maskss) @@ -5821,9 +5867,12 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) /* Assign the value to the limit... */ gfc_add_modify (&ifblock, limit, arrayse.expr); - tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[0]), - loop.loopvar[0], offset[0]); - gfc_add_modify (&ifblock, pos[0], tmp); + for (int i = 0; i < loop.dimen; i++) + { + tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[i]), + loop.loopvar[i], offset[i]); + gfc_add_modify (&ifblock, pos[i], tmp); + } ifbody = gfc_finish_block (&ifblock); @@ -5873,9 +5922,6 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) else tmp = gfc_finish_block (&block); gfc_add_expr_to_block (&body, tmp); - /* Avoid initializing loopvar[0] again, it should be left where - it finished by the first loop. */ - loop.from[0] = loop.loopvar[0]; } gfc_trans_scalarizing_loops (&loop, &body); @@ -11782,11 +11828,9 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) gfc_actual_arglist *array_arg = expr->value.function.actual; gfc_actual_arglist *dim_arg = array_arg->next; - gfc_actual_arglist *mask_arg = dim_arg->next; gfc_expr *array = array_arg->expr; gfc_expr *dim = dim_arg->expr; - gfc_expr *mask = mask_arg->expr; if (!(array->ts.type == BT_INTEGER || array->ts.type == BT_REAL)) @@ -11795,12 +11839,7 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr) if (array->rank == 1) return true; - if (array->ts.type != BT_INTEGER - || dim != nullptr) - return false; - - if (mask == nullptr - || mask->rank == 0) + if (dim == nullptr) return true; return false; diff --git a/gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 b/gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 index ad93d238e74..071c1c37868 100644 --- a/gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 +++ b/gcc/testsuite/gfortran.dg/maxloc_bounds_5.f90 @@ -1,6 +1,6 @@ ! { dg-do run } ! { dg-options "-fbounds-check" } -! { dg-shouldfail "Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2" } +! { dg-shouldfail "Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2|Array bound mismatch for dimension 1 of array 'res' .3/2." } module tst contains subroutine foo(res) @@ -18,4 +18,4 @@ program main integer :: res(3) call foo(res) end program main -! { dg-output "Fortran runtime error: Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2" } +! { dg-output "Fortran runtime error: Incorrect extent in return value of MAXLOC intrinsic: is 3, should be 2|Array bound mismatch for dimension 1 of array 'res' .3/2." } diff --git a/gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 b/gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 index 3a63418aef3..0ce0bfcb70c 100644 --- a/gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 +++ b/gcc/testsuite/gfortran.dg/maxloc_bounds_6.f90 @@ -1,6 +1,6 @@ ! { dg-do run } ! { dg-options "-fbounds-check" } -! { dg-shouldfail "Incorrect extent in MASK argument of MAXLOC intrinsic in dimension 2: is 3, should be 2" } +! { dg-shouldfail "Incorrect extent in MASK argument of MAXLOC intrinsic in dimension 2: is 3, should be 2|Array bound mismatch for dimension 2 of array 'm' .3/2." } program main integer(kind=4), allocatable :: f(:,:) logical, allocatable :: m(:,:) @@ -12,4 +12,4 @@ program main res = maxloc(f,mask=m) write(line,fmt='(80I1)') res end program main -! { dg-output "Fortran runtime error: Incorrect extent in MASK argument of MAXLOC intrinsic in dimension 2: is 3, should be 2" } +! { dg-output "Fortran runtime error: Incorrect extent in MASK argument of MAXLOC intrinsic in dimension 2: is 3, should be 2|Array bound mismatch for dimension 2 of array 'm' .3/2." } From patchwork Wed Jul 31 20:07:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mikael Morin X-Patchwork-Id: 1967293 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=orange.fr header.i=@orange.fr header.a=rsa-sha256 header.s=t20230301 header.b=QWc5H9pl; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WZ3BR2v6xz1ybX for ; Thu, 1 Aug 2024 06:10:55 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 73D93385DDC4 for ; Wed, 31 Jul 2024 20:10:53 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from msa.smtpout.orange.fr (smtp-82.smtpout.orange.fr [80.12.242.82]) by sourceware.org (Postfix) with ESMTPS id 33C773858420; Wed, 31 Jul 2024 20:07:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 33C773858420 Authentication-Results: sourceware.org; dmarc=pass (p=quarantine dis=none) header.from=orange.fr Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=orange.fr ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 33C773858420 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=80.12.242.82 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456469; cv=none; b=Q8Sgmi6TZc9r32EZSf01DPQ3vG9XNEpTRhiRgcaiJwzzLC6bo5ZoFktJmdl9VNokbSS7HGPOMn/uNwunuM+Rf9FNaS1uAJ4dqp//xpf8HvoH+GmQ65ZEcfdH/OCk9dcRV18QOPFk/LSRCNysLglh1uehC3rdCIRwPw4kUVqf14Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1722456469; c=relaxed/simple; bh=ZUpnrViA1KHnHXA11yIo7/5Vt+e0nM934pcPRtbOOis=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=M+xybOrvmlBhXkBDu1yN4Bx+mVK9lNU9p5fnauV731SzGBvcOSYKgamM0CGPKtaq+oBVZYz8TQS6aoAqin7iTRqNGF0W8nBNjseRpxZ6QUjyPIGgHYIKnb4ktCw2z1iAQHOTnHcTdgoLTMuI6XCZZRYZG2vs13lzGuOHGyHAhBs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from cyrano.home ([86.215.161.51]) by smtp.orange.fr with ESMTPA id ZFbisBHiIYjQzZFbssqYRa; Wed, 31 Jul 2024 22:07:44 +0200 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=orange.fr; s=t20230301; t=1722456464; bh=omsJ5Ae8/zBPXQdi7Kj5BmBn6NzPg8J1sIEghfkp5+s=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=QWc5H9plMTtLdk4pWGHFQhQlO7XAvBVq1RFIdfyFFpoKQMSyAKc41xJaQPAX/hTMJ 6Mc3dQvjr08yJRUTGHh10a6YPn6wK7a4Xpw/dFq7ShYQSUjbiTc6DrPEGDmax+ei5t cOgAvbNramgT148uFKFfnBL0KfwS2ziwdWBjQ203FuImUZ7aI2sWsdjdqYkvUMhUPo u9iDWPWL3IrQ+Qi5wx7rMAc3ACjGmiiVQmU0PgYCSDMwDb7wIA1zActpIZA13Jbv4I MZr/dhmNFZnISOPwY/xvelDOaVR9jT/btm1a1Zmr+ASDrEeZFua1mdJg9KduEyRxtC EE9WBLh/ioheg== X-ME-Helo: cyrano.home X-ME-Auth: bW9yaW4tbWlrYWVsQG9yYW5nZS5mcg== X-ME-Date: Wed, 31 Jul 2024 22:07:44 +0200 X-ME-IP: 86.215.161.51 From: Mikael Morin To: gcc-patches@gcc.gnu.org, fortran@gcc.gnu.org Subject: [PATCH 8/8] fortran: Continue MINLOC/MAXLOC second loop where the first stopped [PR90608] Date: Wed, 31 Jul 2024 22:07:35 +0200 Message-ID: <20240731200735.229898-9-morin-mikael@orange.fr> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240731200735.229898-1-morin-mikael@orange.fr> References: <20240731200735.229898-1-morin-mikael@orange.fr> MIME-Version: 1.0 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Continue the second set of loops where the first one stopped in the generated inline MINLOC/MAXLOC code in the cases where the generated code contains two sets of loops. This fixes a regression that was introduced when enabling the generation of inline MINLOC/MAXLOC code with ARRAY of rank greater than 1, no DIM argument, and either non-scalar MASK or floating- point ARRAY. In the cases where two sets of loops are generated as inline MINLOC/MAXLOC code, we previously generated code such as (for rank 2 ARRAY, so with two levels of nesting): for (idx11 in lower1..upper1) { for (idx12 in lower2..upper2) { ... if (...) { ... goto second_loop; } } } second_loop: for (idx21 in lower1..upper1) { for (idx22 in lower2..upper2) { ... } } which means we process the first elements twice, once in the first set of loops and once in the second one. This change avoids this duplicate processing by using a conditional as lower bound for the second set of loops, generating code like: second_loop_entry = false; for (idx11 in lower1..upper1) { for (idx12 in lower2..upper2) { ... if (...) { ... second_loop_entry = true; goto second_loop; } } } second_loop: for (idx21 in (second_loop_entry ? idx11 : lower1)..upper1) { for (idx22 in (second_loop_entry ? idx12 : lower2)..upper2) { ... second_loop_entry = false; } } It was expected that the compiler optimizations would be able to remove the state variable second_loop_entry. It is the case if ARRAY has rank 1 (so without loop nesting), the variable is removed and the loop bounds become unconditional, which restores previously generated code, fully fixing the regression. For larger rank, unfortunately, the state variable and conditional loop bounds remain, but those cases were previously using library calls, so it's not a regression. PR fortran/90608 gcc/fortran/ChangeLog: * trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Generate a set of index variables. Set them using the loop indexes before leaving the first set of loops. Generate a new loop entry predicate. Initialize it. Set it before leaving the first set of loops. Clear it in the body of the second set of loops. For the second set of loops, update each loop lower bound to use the corresponding index variable if the predicate variable is set. --- gcc/fortran/trans-intrinsic.cc | 33 +++++++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc index 3a6a73d4241..89134b1190b 100644 --- a/gcc/fortran/trans-intrinsic.cc +++ b/gcc/fortran/trans-intrinsic.cc @@ -5342,6 +5342,7 @@ strip_kind_from_actual (gfc_actual_arglist * actual) pos0 = 0; pos1 = 0; S1 = from1; + second_loop_entry = false; while (S1 <= to1) { S0 = from0; while (s0 <= to0 { @@ -5354,6 +5355,7 @@ strip_kind_from_actual (gfc_actual_arglist * actual) limit = a[S1][S0]; pos0 = S0 + (1 - from0); pos1 = S1 + (1 - from1); + second_loop_entry = true; goto lab1; } } @@ -5363,9 +5365,9 @@ strip_kind_from_actual (gfc_actual_arglist * actual) } goto lab2; lab1:; - S1 = from1; + S1 = second_loop_entry ? S1 : from1; while (S1 <= to1) { - S0 = from0; + S0 = second_loop_entry ? S0 : from0; while (S0 <= to0) { if (mask[S1][S0]) if (a[S1][S0] < limit) { @@ -5373,6 +5375,7 @@ strip_kind_from_actual (gfc_actual_arglist * actual) pos0 = S + (1 - from0); pos1 = S + (1 - from1); } + second_loop_entry = false; S0++; } S1++; @@ -5444,6 +5447,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_expr *backexpr; gfc_se backse; tree pos[GFC_MAX_DIMENSIONS]; + tree idx[GFC_MAX_DIMENSIONS]; tree result_var = NULL_TREE; int n; bool optional_mask; @@ -5525,6 +5529,8 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_get_string ("pos%d", i)); offset[i] = gfc_create_var (gfc_array_index_type, gfc_get_string ("offset%d", i)); + idx[i] = gfc_create_var (gfc_array_index_type, + gfc_get_string ("idx%d", i)); } /* Walk the arguments. */ @@ -5609,6 +5615,18 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) gfc_add_modify (&se->pre, limit, tmp); + /* If we are in a case where we generate two sets of loops, the second one + should continue where the first stopped instead of restarting from the + beginning. So nested loops in the second set should have a partial range + on the first iteration, but they should start from the beginning and span + their full range on the following iterations. So we use conditionals in + the loops lower bounds, and use the following variable in those + conditionals to decide whether to use the original loop bound or to use + the index at which the loop from the first set stopped. */ + tree second_loop_entry = gfc_create_var (logical_type_node, + "second_loop_entry"); + gfc_add_modify (&se->pre, second_loop_entry, logical_false_node); + /* Initialize the scalarizer. */ gfc_init_loopinfo (&loop); @@ -5752,8 +5770,11 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) tmp = fold_build2_loc (input_location, PLUS_EXPR, TREE_TYPE (pos[i]), loop.loopvar[i], offset[i]); gfc_add_modify (&ifblock, pos[i], tmp); + gfc_add_modify (&ifblock, idx[i], loop.loopvar[i]); } + gfc_add_modify (&ifblock, second_loop_entry, logical_true_node); + if (lab1) gfc_add_expr_to_block (&ifblock, build1_v (GOTO_EXPR, lab1)); @@ -5816,6 +5837,12 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) if (lab1) { + for (int i = 0; i < loop.dimen; i++) + loop.from[i] = fold_build3_loc (input_location, COND_EXPR, + TREE_TYPE (loop.from[i]), + second_loop_entry, idx[i], + loop.from[i]); + gfc_trans_scalarized_loop_boundary (&loop, &body); stmtblock_t * const outer_block = &loop.code[loop.dimen - 1]; @@ -5921,7 +5948,9 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * se, gfc_expr * expr, enum tree_code op) } else tmp = gfc_finish_block (&block); + gfc_add_expr_to_block (&body, tmp); + gfc_add_modify (&body, second_loop_entry, logical_false_node); } gfc_trans_scalarizing_loops (&loop, &body);