From patchwork Wed Sep 30 21:39:35 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nathan Sidwell X-Patchwork-Id: 524629 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 6402E140D68 for ; Thu, 1 Oct 2015 07:39:49 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=n8ypMLo9; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=i41BAaL551B6wx7KX5j8HtlNu7yAUNb2Bzzd7fdjHu98KRRTCs EaItw1M8SKOvANnN8oTfFFR7GRvPvV+n5BszuHVHXk4ti9ti/8gORZXZJn/oLZj9 SIgCgCVU9nmGGLtH/S09VwtYPD23duoHifufItIWQcOB69B+QuL78Q7sM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; s= default; bh=e2PqTA+tOEEK8PM+ju/eZqiMnbc=; b=n8ypMLo9+faUDn/CeGOD JHbQyxB+XGDfMw0AWuqHndnA1QYWIsWcLFg53ipQHykFmXA9QNrMxySrZtmfFDsX ONn9jthf0bEcnMYY0LemQvSAIGAW9uAgpgfqthkXx72vOU18OrCipAHcCdew5CNs VmGEChIoKu2rm9/XrOziuHs= Received: (qmail 116182 invoked by alias); 30 Sep 2015 21:39:42 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 116167 invoked by uid 89); 30 Sep 2015 21:39:41 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.9 required=5.0 tests=BAYES_50, FREEMAIL_FROM, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=no version=3.3.2 X-HELO: mail-qg0-f52.google.com Received: from mail-qg0-f52.google.com (HELO mail-qg0-f52.google.com) (209.85.192.52) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256 encrypted) ESMTPS; Wed, 30 Sep 2015 21:39:39 +0000 Received: by qgez77 with SMTP id z77so48386990qge.1 for ; Wed, 30 Sep 2015 14:39:37 -0700 (PDT) X-Received: by 10.140.109.6 with SMTP id k6mr7328334qgf.28.1443649177532; Wed, 30 Sep 2015 14:39:37 -0700 (PDT) Received: from ?IPv6:2601:181:c000:c497:a2a8:cdff:fe3e:b48? ([2601:181:c000:c497:a2a8:cdff:fe3e:b48]) by smtp.googlemail.com with ESMTPSA id d66sm1073918qgd.36.2015.09.30.14.39.36 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 30 Sep 2015 14:39:36 -0700 (PDT) To: GCC Patches Cc: Bernd Schmidt , Thomas Schwinge From: Nathan Sidwell Subject: ptx offload data format Message-ID: <560C5697.2040107@acm.org> Date: Wed, 30 Sep 2015 17:39:35 -0400 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 I've merged this patch to trunk. It changes the PTX offload data format to be an array of pointers to strings, preparing the way for the static linking patch that Thomas is working on. For the moment, we retain the automatic linking on of the support functions during PTX JITing. Some of the changes to link_ptx were done by Bernd a while back. No change to the PTX ABI version number, as that just got incremented last week with the launch API change -- it's in a state of flux right now. nathan 2015-09-30 Nathan Sidwell gcc/ * config/nvptx/mkoffload.c (process): Change offload data format. 2015-09-30 Nathan Sidwell Bernd Schmidt libgomp/ * plugin/plugin-nvptx.c (targ_fn_launch): Use GOMP_DIM_MAX. (struct targ_ptx_obj): New. (nvptx_tdata): Move earlier, change data format. (link_ptx): Take targ_ptx_obj ptr and count. Allow multiple objects. (GOMP_OFFLOAD_load_image): Adjust. Index: gcc/config/nvptx/mkoffload.c =================================================================== --- gcc/config/nvptx/mkoffload.c (revision 228242) +++ gcc/config/nvptx/mkoffload.c (working copy) @@ -844,39 +844,53 @@ process (FILE *in, FILE *out) Token *tok = tokenize (input); const char *comma; id_map const *id; + unsigned obj_count = 0; + unsigned ix; do tok = parse_file (tok); while (tok->kind); - fprintf (out, "static const char ptx_code[] = \n"); + fprintf (out, "static const char ptx_code_%u[] = \n", obj_count++); write_stmts (out, rev_stmts (decls)); write_stmts (out, rev_stmts (vars)); write_stmts (out, rev_stmts (fns)); fprintf (out, ";\n\n"); + /* Dump out array of pointers to ptx object strings. */ + fprintf (out, "static const struct ptx_obj {\n" + " const char *code;\n" + " __SIZE_TYPE__ size;\n" + "} ptx_objs[] = {"); + for (comma = "", ix = 0; ix != obj_count; comma = ",", ix++) + fprintf (out, "%s\n\t{ptx_code_%u, sizeof (ptx_code_%u)}", comma, ix, ix); + fprintf (out, "\n};\n\n"); + + /* Dump out variable idents. */ fprintf (out, "static const char *const var_mappings[] = {"); for (comma = "", id = var_ids; id; comma = ",", id = id->next) fprintf (out, "%s\n\t%s", comma, id->ptx_name); fprintf (out, "\n};\n\n"); + /* Dump out function idents. */ fprintf (out, "static const struct nvptx_fn {\n" " const char *name;\n" - " unsigned short dim[3];\n" - "} func_mappings[] = {\n"); + " unsigned short dim[%d];\n" + "} func_mappings[] = {\n", GOMP_DIM_MAX); for (comma = "", id = func_ids; id; comma = ",", id = id->next) fprintf (out, "%s\n\t{%s}", comma, id->ptx_name); fprintf (out, "\n};\n\n"); fprintf (out, "static const struct nvptx_tdata {\n" - " const char *ptx_src;\n" + " const struct ptx_obj *ptx_objs;\n" + " unsigned ptx_num;\n" " const char *const *var_names;\n" - " __SIZE_TYPE__ var_num;\n" + " unsigned var_num;\n" " const struct nvptx_fn *fn_names;\n" - " __SIZE_TYPE__ fn_num;\n" + " unsigned fn_num;\n" "} target_data = {\n" - " ptx_code,\n" + " ptx_objs, sizeof (ptx_objs) / sizeof (ptx_objs[0]),\n" " var_mappings," " sizeof (var_mappings) / sizeof (var_mappings[0]),\n" " func_mappings," Index: libgomp/plugin/plugin-nvptx.c =================================================================== --- libgomp/plugin/plugin-nvptx.c (revision 228265) +++ libgomp/plugin/plugin-nvptx.c (working copy) @@ -224,9 +224,31 @@ map_push (struct ptx_stream *s, int asyn struct targ_fn_launch { const char *fn; - unsigned short dim[3]; + unsigned short dim[GOMP_DIM_MAX]; }; +/* Target PTX object information. */ + +struct targ_ptx_obj +{ + const char *code; + size_t size; +}; + +/* Target data image information. */ + +typedef struct nvptx_tdata +{ + const struct targ_ptx_obj *ptx_objs; + unsigned ptx_num; + + const char *const *var_names; + unsigned var_num; + + const struct targ_fn_launch *fn_descs; + unsigned fn_num; +} nvptx_tdata_t; + /* Descriptor of a loaded function. */ struct targ_fn_descriptor @@ -688,7 +710,8 @@ nvptx_get_num_devices (void) static void -link_ptx (CUmodule *module, const char *ptx_code) +link_ptx (CUmodule *module, const struct targ_ptx_obj *ptx_objs, + unsigned num_objs) { CUjit_option opts[7]; void *optvals[7]; @@ -702,8 +725,6 @@ link_ptx (CUmodule *module, const char * void *linkout; size_t linkoutsize __attribute__ ((unused)); - GOMP_PLUGIN_debug (0, "attempting to load:\n---\n%s\n---\n", ptx_code); - opts[0] = CU_JIT_WALL_TIME; optvals[0] = &elapsed; @@ -758,25 +779,37 @@ link_ptx (CUmodule *module, const char * cuda_error (r)); } - /* cuLinkAddData's 'data' argument erroneously omits the const qualifier. */ - r = cuLinkAddData (linkstate, CU_JIT_INPUT_PTX, (char *)ptx_code, - strlen (ptx_code) + 1, 0, 0, 0, 0); - if (r != CUDA_SUCCESS) + for (; num_objs--; ptx_objs++) { - GOMP_PLUGIN_error ("Link error log %s\n", &elog[0]); - GOMP_PLUGIN_fatal ("cuLinkAddData (ptx_code) error: %s", cuda_error (r)); + /* cuLinkAddData's 'data' argument erroneously omits the const + qualifier. */ + GOMP_PLUGIN_debug (0, "Loading:\n---\n%s\n---\n", ptx_objs->code); + r = cuLinkAddData (linkstate, CU_JIT_INPUT_PTX, (char*)ptx_objs->code, + ptx_objs->size, 0, 0, 0, 0); + if (r != CUDA_SUCCESS) + { + GOMP_PLUGIN_error ("Link error log %s\n", &elog[0]); + GOMP_PLUGIN_fatal ("cuLinkAddData (ptx_code) error: %s", + cuda_error (r)); + } } + GOMP_PLUGIN_debug (0, "Linking\n"); r = cuLinkComplete (linkstate, &linkout, &linkoutsize); - if (r != CUDA_SUCCESS) - GOMP_PLUGIN_fatal ("cuLinkComplete error: %s", cuda_error (r)); GOMP_PLUGIN_debug (0, "Link complete: %fms\n", elapsed); GOMP_PLUGIN_debug (0, "Link log %s\n", &ilog[0]); + if (r != CUDA_SUCCESS) + GOMP_PLUGIN_fatal ("cuLinkComplete error: %s", cuda_error (r)); + r = cuModuleLoadData (module, linkout); if (r != CUDA_SUCCESS) GOMP_PLUGIN_fatal ("cuModuleLoadData error: %s", cuda_error (r)); + + r = cuLinkDestroy (linkstate); + if (r != CUDA_SUCCESS) + GOMP_PLUGIN_fatal ("cuLinkDestory error: %s", cuda_error (r)); } static void @@ -1502,19 +1535,6 @@ GOMP_OFFLOAD_fini_device (int n) pthread_mutex_unlock (&ptx_dev_lock); } -/* Data emitted by mkoffload. */ - -typedef struct nvptx_tdata -{ - const char *ptx_src; - - const char *const *var_names; - size_t var_num; - - const struct targ_fn_launch *fn_descs; - size_t fn_num; -} nvptx_tdata_t; - /* Return the libgomp version number we're compatible with. There is no requirement for cross-version compatibility. */ @@ -1553,7 +1573,7 @@ GOMP_OFFLOAD_load_image (int ord, unsign nvptx_attach_host_thread_to_device (ord); - link_ptx (&module, img_header->ptx_src); + link_ptx (&module, img_header->ptx_objs, img_header->ptx_num); /* The mkoffload utility emits a struct of pointers/integers at the start of each offload image. The array of kernel names and the