From: Julian Brown
Subject: [PATCH] openacc: Fix mkoffload SGPR/VGPR count parsing for HSACO v3
Date: Tue, 8 Sep 2020 13:19:45 -0700
Message-ID: <20200908201947.43277-2-julian@codesourcery.com>
Cc: ams@codesourcery.com
X-Patchwork-Submitter: Julian Brown
X-Patchwork-Id: 1360057

If an offload kernel uses a large number of VGPRs, AMD GCN hardware may need
to limit the number of threads/workers launched for that kernel. The number
of SGPRs/VGPRs in use is detected by mkoffload and recorded in the processed
output. The patterns emitted detailing SGPR/VGPR occupancy changed between
HSACO v2 and v3 though, so this patch updates parsing to account for that.

Tested with offloading to AMD GCN. I will apply shortly.

Julian

2020-09-08  Julian Brown

	gcc/
	* config/gcn/mkoffload.c (process_asm): Initialise regcount.
	Update scanning for SGPR/VGPR usage for HSACO v3.
---
 gcc/config/gcn/mkoffload.c | 40 ++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 15 deletions(-)

diff --git a/gcc/config/gcn/mkoffload.c b/gcc/config/gcn/mkoffload.c
index 808ce53176c..0983b98e178 100644
--- a/gcc/config/gcn/mkoffload.c
+++ b/gcc/config/gcn/mkoffload.c
@@ -432,7 +432,7 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
     int sgpr_count;
     int vgpr_count;
     char *kernel_name;
-  } regcount;
+  } regcount = { -1, -1, NULL };
 
   /* Always add _init_array and _fini_array as kernels.  */
   obstack_ptr_grow (&fns_os, xstrdup ("_init_array"));
@@ -440,7 +440,12 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
   fn_count += 2;
 
   char buf[1000];
-  enum { IN_CODE, IN_AMD_KERNEL_CODE_T, IN_VARS, IN_FUNCS } state = IN_CODE;
+  enum
+    { IN_CODE,
+      IN_METADATA,
+      IN_VARS,
+      IN_FUNCS
+    } state = IN_CODE;
   while (fgets (buf, sizeof (buf), in))
     {
       switch (state)
@@ -453,21 +458,25 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
 		obstack_grow (&dims_os, &dim, sizeof (dim));
 		dims_count++;
 	      }
-	    else if (sscanf (buf, " .amdgpu_hsa_kernel %ms\n",
-			     &regcount.kernel_name) == 1)
-	      break;
 
 	    break;
 	  }
-	case IN_AMD_KERNEL_CODE_T:
+	case IN_METADATA:
 	  {
-	    gcc_assert (regcount.kernel_name);
-	    if (sscanf (buf, " wavefront_sgpr_count = %d\n",
-			&regcount.sgpr_count) == 1)
+	    if (sscanf (buf, " - .name: %ms\n", &regcount.kernel_name) == 1)
 	      break;
-	    else if (sscanf (buf, " workitem_vgpr_count = %d\n",
+	    else if (sscanf (buf, " .sgpr_count: %d\n",
+			     &regcount.sgpr_count) == 1)
+	      {
+		gcc_assert (regcount.kernel_name);
+		break;
+	      }
+	    else if (sscanf (buf, " .vgpr_count: %d\n",
 			     &regcount.vgpr_count) == 1)
-	      break;
+	      {
+		gcc_assert (regcount.kernel_name);
+		break;
+	      }
 
 	    break;
 	  }
@@ -508,9 +517,10 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
 	state = IN_VARS;
       else if (sscanf (buf, " .section .gnu.offload_funcs%c", &dummy) > 0)
 	state = IN_FUNCS;
-      else if (sscanf (buf, " .amd_kernel_code_%c", &dummy) > 0)
+      else if (sscanf (buf, " .amdgpu_metadata%c", &dummy) > 0)
 	{
-	  state = IN_AMD_KERNEL_CODE_T;
+	  state = IN_METADATA;
+	  regcount.kernel_name = NULL;
 	  regcount.sgpr_count = regcount.vgpr_count = -1;
 	}
       else if (sscanf (buf, " .section %c", &dummy) > 0
@@ -519,7 +529,7 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
 	       || sscanf (buf, " .data%c", &dummy) > 0
 	       || sscanf (buf, " .ident %c", &dummy) > 0)
 	state = IN_CODE;
-      else if (sscanf (buf, " .end_amd_kernel_code_%c", &dummy) > 0)
+      else if (sscanf (buf, " .end_amdgpu_metadata%c", &dummy) > 0)
 	{
 	  state = IN_CODE;
 	  gcc_assert (regcount.kernel_name != NULL
@@ -531,7 +541,7 @@ process_asm (FILE *in, FILE *out, FILE *cfile)
 	  regcount.sgpr_count = regcount.vgpr_count = -1;
 	}
 
-      if (state == IN_CODE || state == IN_AMD_KERNEL_CODE_T)
+      if (state == IN_CODE || state == IN_METADATA)
 	fputs (buf, out);
     }
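
For readers who want to try the new matching outside of mkoffload, here is a minimal standalone sketch of the per-line scan used in the IN_METADATA state. It is not part of the patch. The names regcount_sketch and scan_metadata_line are hypothetical, a fixed-size buffer with %255s is swapped in for the glibc-specific %ms conversion, and assert stands in for gcc_assert.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the regcount struct in process_asm.  The
   kernel name is a fixed buffer here, where the patch uses %ms.  */
struct regcount_sketch
{
  int sgpr_count;
  int vgpr_count;
  char kernel_name[256];
};

/* Scan one line from inside an .amdgpu_metadata block, using the same
   sscanf patterns as the patch.  Returns 1 if the line set a field of
   RC, and 0 if the line was not one of the three recognised forms.  */
static int
scan_metadata_line (const char *buf, struct regcount_sketch *rc)
{
  if (sscanf (buf, " - .name: %255s", rc->kernel_name) == 1)
    return 1;
  else if (sscanf (buf, " .sgpr_count: %d", &rc->sgpr_count) == 1)
    {
      /* As in the patch, a count is only expected after a name.  */
      assert (rc->kernel_name[0] != '\0');
      return 1;
    }
  else if (sscanf (buf, " .vgpr_count: %d", &rc->vgpr_count) == 1)
    {
      assert (rc->kernel_name[0] != '\0');
      return 1;
    }
  return 0;
}
```

A caller would initialise the record to { -1, -1, "" }, feed it each line read while between .amdgpu_metadata and .end_amdgpu_metadata, and check that all three fields were filled in once the end marker is seen, which is what the gcc_assert in the .end_amdgpu_metadata branch does.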