From patchwork Mon Sep 2 19:13:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Tobias Burnus
X-Patchwork-Id: 1979739
Message-ID: <98e3a21b-90ac-4fcf-9d44-62822adaef0d@baylibre.com>
Date: Mon, 2 Sep 2024 21:13:29 +0200
To: Jakub Jelinek, Thomas Schwinge, gcc-patches
From: Tobias Burnus
Subject: [patch] config/nvptx: Handle downward compat for OpenMP context selector
List-Id: Gcc-patches mailing list

For x86-64,
the context selector matching is currently based on features. That's
obvious for 'SSE2', where any system offering SSE2 matches, but that is
also the case for, e.g., a selector asking for 'i486', which matches when
compiling for 'i486', 'i586' and 'i686'.

That has pros and cons. Assume compiling for 'i686': If there is a context
selector asking for ISA 'i486', we want to use it, as i686 supports it,
and not, e.g., the generic fallback. On the other hand, if there are two
variants, one for 'i686' and one for 'i486', we want to use the 'i686'
variant if the hardware supports it.

[I am not sure how to handle this best.]

* * *

The attached patch now does likewise for nvptx, where the compute
capabilities are downward compatible with one exception →
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#ptx-module-directives-target

"In general, generations of SM architectures follow an onion layer model,
where each generation adds new features and retains all features of
previous generations. The onion layer model allows the PTX code generated
for a given target to be run on later generation devices.

Target architectures with suffix “a”, such as sm_90a, include
architecture-accelerated features that are supported on the specified
architecture only, hence such targets do not follow the onion layer model.
Therefore, PTX code generated for such targets cannot be run on later
generation devices. Architecture-accelerated features can only be used
with targets that support these features."

* * *

The patch additionally updates the documentation.

Comments, suggestions, approval, disapproval?

Tobias

PS: I wonder whether it wouldn't make sense to permit all sm_ values with
-march=, even if some produce the same binaries (at least for now), vs.
supporting only some with -march= and using -march-map= to handle all
values. But that's independent of this RFC patch.
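To make the x86 trade-off from the first part of this mail concrete, here
is a minimal sketch in plain C. It does not reflect GCC's actual selector
scoring: the `isa_level` ordering and the helpers `variant_matches` and
`pick_variant` are hypothetical names introduced only for illustration.
Feature-based matching makes both an 'i486' and an 'i686' variant
applicable when compiling for 'i686'; the sketch then prefers the most
specific applicable one, which is only one possible policy.

```c
#include <stddef.h>

/* Hypothetical ordering of x86 ISA levels; not GCC's internal
   representation.  */
enum isa_level { ISA_I486 = 4, ISA_I586 = 5, ISA_I686 = 6 };

/* Feature-based matching: a variant requesting REQUESTED applies whenever
   the compilation target TARGET offers at least that level.  */
static int
variant_matches (enum isa_level requested, enum isa_level target)
{
  return requested <= target;
}

/* One possible policy (an assumption, not what GCC does): among all
   matching variants, pick the one with the highest requested level.
   Returns the index into VARIANTS, or -1 if none matches, i.e. the
   generic fallback would be used.  */
static int
pick_variant (const enum isa_level *variants, size_t n,
	      enum isa_level target)
{
  int best = -1;
  for (size_t i = 0; i < n; i++)
    if (variant_matches (variants[i], target)
	&& (best < 0 || variants[i] > variants[best]))
      best = (int) i;
  return best;
}
```

With variants for 'i486' and 'i686', `pick_variant` selects the 'i686'
entry when the target is 'i686' and the 'i486' entry when the target is
'i586'.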
config/nvptx: Handle downward compat for OpenMP context selector

Nvptx's compute capabilities (SM_XX) are downward compatible, i.e. SM_80
supports all features of SM_30, SM_70 etc. Additionally, GCC's -march=
currently only supports those values that actually change the generated
code - and offers -march-map=... to map higher values to the next lower
supported version.

Update libgomp.texi to document the downward compatibility and case
sensitivity of the context selectors.

gcc/ChangeLog:

	* config/nvptx/nvptx-sm.def (NVPTX_SM_COMPAT): Add compute
	capabilities supported by -march-map= lower than sm_80 (= highest
	supported -march=).
	* config/nvptx/gen-omp-device-properties.sh: Handle it.
	* config/nvptx/gen-h.sh: Ignore it.
	* config/nvptx/gen-multilib-matches.sh: Likewise.
	* config/nvptx/gen-opt.sh: Likewise.
	* config/nvptx/nvptx.cc (sm_version_to_number): New.
	(nvptx_omp_device_kind_arch_isa): Match when requested ISA (sm_XX)
	version is lower than actual ISA version.

libgomp/ChangeLog:

	* libgomp.texi (OpenMP Context Selectors): Add note about case
	sensitivity and downward compatibility.
	* testsuite/libgomp.c/declare-variant-3.h: Extend to check for
	downward compatibility.
	* testsuite/libgomp.c/declare-variant-3-sm30.c: Update.
	* testsuite/libgomp.c/declare-variant-3-sm35.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm53.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm70.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm75.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3-sm80.c: Likewise.
	* testsuite/libgomp.c/declare-variant-3.c: Likewise.
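
The matching rule described in the ChangeLog above (a requested sm_XX
matches when it is not higher than the actual ISA version) can be sketched
as follows. This is an illustration only and not the code from nvptx.cc:
the helper names `sm_to_number` and `sm_isa_matches` are hypothetical, and
the sketch assumes selectors of the form `sm_` followed only by digits, so
suffixed targets such as sm_90a are rejected rather than treated as
downward compatible. The comparison is case sensitive, in line with the
documentation note.

```c
#include <string.h>

/* Hypothetical helper: convert "sm_NN" to NN.  Returns -1 for anything
   else, including "SM_30" (case sensitive) and suffixed names such as
   "sm_90a".  */
static int
sm_to_number (const char *name)
{
  if (strncmp (name, "sm_", 3) != 0)
    return -1;
  const char *p = name + 3;
  if (*p == '\0')
    return -1;
  int v = 0;
  for (; *p != '\0'; p++)
    {
      /* Reject non-digits and implausibly large values.  */
      if (*p < '0' || *p > '9' || v > 1000)
	return -1;
      v = v * 10 + (*p - '0');
    }
  return v;
}

/* Downward-compatible match: a variant requesting REQUESTED applies when
   compiling for ACTUAL if REQUESTED <= ACTUAL.  */
static int
sm_isa_matches (const char *requested, const char *actual)
{
  int r = sm_to_number (requested);
  int a = sm_to_number (actual);
  return r >= 0 && a >= 0 && r <= a;
}
```

Under these assumptions, a variant asking for sm_30 matches when compiling
for sm_80, while a variant asking for sm_80 does not match when compiling
for sm_70.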
 gcc/config/nvptx/gen-h.sh                          |  2 +-
 gcc/config/nvptx/gen-multilib-matches.sh           |  2 +-
 gcc/config/nvptx/gen-omp-device-properties.sh      |  2 +-
 gcc/config/nvptx/gen-opt.sh                        |  2 +-
 gcc/config/nvptx/nvptx-sm.def                      | 22 +++++++
 gcc/config/nvptx/nvptx.cc                          | 33 ++++++++--
 .../testsuite/libgomp.c/declare-variant-3-sm30.c   |  3 +-
 .../testsuite/libgomp.c/declare-variant-3-sm35.c   |  3 +-
 .../testsuite/libgomp.c/declare-variant-3-sm53.c   |  3 +-
 .../testsuite/libgomp.c/declare-variant-3-sm70.c   |  3 +-
 .../testsuite/libgomp.c/declare-variant-3-sm75.c   |  3 +-
 .../testsuite/libgomp.c/declare-variant-3-sm80.c   |  1 +
 libgomp/testsuite/libgomp.c/declare-variant-3.c    |  8 ++-
 libgomp/testsuite/libgomp.c/declare-variant-3.h    | 75 ++++++++++++++++++++--
 14 files changed, 140 insertions(+), 22 deletions(-)

diff --git a/gcc/config/nvptx/gen-h.sh b/gcc/config/nvptx/gen-h.sh
index ea75e127cde..592dd8bebc8 100644
--- a/gcc/config/nvptx/gen-h.sh
+++ b/gcc/config/nvptx/gen-h.sh
@@ -21,7 +21,7 @@
 nvptx_sm_def="$1/nvptx-sm.def"
 gen_copyright_sh="$1/gen-copyright.sh"
-sms=$(grep ^NVPTX_SM $nvptx_sm_def | sed 's/.*(//;s/,.*//')
+sms=$(grep '^NVPTX_SM[^_]' $nvptx_sm_def | sed 's/.*(//;s/,.*//')
 cat <= v)
+    __builtin_abort ();
+  __builtin_printf ("Nvptx accelerator: sm_%d\n", v);
 #else
-  if (v != 0)
+  if (v != 0 || w != 0)
     __builtin_abort ();
 #endif