From patchwork Mon Dec 7 11:19:08 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Jambor
X-Patchwork-Id: 553342
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Mon, 7 Dec 2015 12:19:08 +0100
From: Martin Jambor
To: GCC Patches
Cc: Jakub Jelinek
Subject: [hsa 1/10] Configury changes and new options
Message-ID: <20151207111908.GB24234@virgil.suse.cz>
References: <20151207111758.GA24234@virgil.suse.cz>
In-Reply-To: <20151207111758.GA24234@virgil.suse.cz>

Hi,

this patch contains changes to the configuration mechanism and offload
bits, so that users can build compilers with HSA support.  It plays
nicely with other accelerators despite using an altogether different
implementation approach.  I have also added to it definitions of the
new options and parameters, since at least one hunk in common.opt is
highly related.  -fdisable-hsa-gridification has disappeared,
otherwise very little has changed since the last submission.

With this patch, the user can request HSA support by including the
string "hsa" among the requested accelerators in
--enable-offload-targets.  This will cause the compiler to start
producing HSAIL for target OpenMP constructs/functions and the hsa
libgomp plugin to be built.  Because the plugin needs to use the HSA
run-time library, I have introduced the option --with-hsa-runtime (and
the more precise --with-hsa-runtime-include and --with-hsa-runtime-lib)
to help find it.
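
To illustrate how the configure.ac hunk in this patch treats the target list, here is a minimal C sketch of the same classification.  The real logic is the shell loop in configure.ac; the struct and function names below are hypothetical and exist only for this example.  A target whose name starts with "hsa" turns on HSAIL generation, any other target turns on conventional offloading.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of GCC.  */
struct offload_config
{
  bool enable_hsa;        /* Some target name starts with "hsa".  */
  bool enable_offloading; /* Some target is not an HSA target.  */
};

/* Classify a comma-separated --enable-offload-targets value.  Each entry
   may carry a "=path" suffix, which does not matter here because only the
   start of the name is inspected.  Empty entries are ignored.  */
struct offload_config
classify_offload_targets (const char *targets)
{
  struct offload_config cfg = { false, false };
  assert (targets != NULL);

  const char *p = targets;
  while (*p != '\0')
    {
      /* Length of the current entry, up to the next comma or the end.  */
      size_t len = strcspn (p, ",");

      if (len > 0)
        {
          if (len >= 3 && strncmp (p, "hsa", 3) == 0)
            cfg.enable_hsa = true;
          else
            cfg.enable_offloading = true;
        }

      p += len;
      if (*p == ',')
        p++;
    }
  return cfg;
}
```

With "nvptx-none,hsa" both flags end up set, whereas "hsa" alone leaves enable_offloading false, which corresponds to ENABLE_OFFLOADING not being defined.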

The open-source HSA run-time available on GitHub is binary compatible
with the closed-source one.  The closed-source one, however, also
contains the finalizer and so has to be used for all practical
purposes.  I am regularly asking AMD to keep their promise and
open-source the finalizer too.

One catch, however, is that there is no offload compiler for HSA, so
the wrapper should not attempt to look for one (that is what the hunk
in lto-wrapper.c does).  Also, when HSA is the only configured
accelerator, it is wasteful to output LTO sections with byte-code, and
therefore configure does not define the ENABLE_OFFLOADING macro in
that case.  Finally, when the compiler has been configured for HSA but
the user disables it by omitting it in the -foffload compiler option,
we need to observe that decision.  That is what the opts.c hunk does.

As far as the options are concerned, the patch adds a new warning
-Whsa, which we emit whenever we fail to produce HSAIL for some source
code.  It is on by default, but the warnings are of course only
emitted by the HSAIL-generating code and so will never affect anybody
who does not use both an HSA-enabled compiler and OpenMP 4 device
constructs.

Then there is a new parameter hsa-gen-debug-stores, which will be
obsolete once the HSA run-time supports debugging traps.  Until then,
we have to make do with debugging stores to memory at defined places,
which however can cost speed in benchmarks, so we only enable them
with this parameter.  We decided to make it a parameter rather than a
switch to emphasize the fact that it will go away and to possibly
allow us to select different levels of verbosity of the stores in the
future.

Any feedback is much appreciated,

Martin


2015-12-04  Martin Jambor

gcc/
	* Makefile.in (OBJS): Add new source files.
	(GTFILES): Add hsa.c.
	* config.in (ENABLE_HSA): New.
	* configure.ac: Treat hsa differently from other accelerators.
	(OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to
	$enable_offloading.
	(ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa.
	* doc/install.texi (Configuration): Document --with-hsa-runtime,
	--with-hsa-runtime-include and --with-hsa-runtime-lib.
	* lto-wrapper.c (compile_images_for_offload_targets): Do not attempt
	to invoke offload compiler for hsa accelerator.
	* opts.c (common_handle_option): Determine whether HSA offloading
	should be performed.
	* common.opt (disable_hsa): New variable.
	(-Whsa): New warning.
	* doc/invoke.texi (-Whsa): Document.
	(hsa-gen-debug-stores): Likewise.
	* params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter.

libgomp/plugin/
	* Makefrag.am: Add HSA plugin requirements.
	* configfrag.ac (HSA_RUNTIME_INCLUDE): New variable.
	(HSA_RUNTIME_LIB): Likewise.
	(HSA_RUNTIME_CPPFLAGS): Likewise.
	(HSA_RUNTIME_INCLUDE): New substitution.
	(HSA_RUNTIME_LIB): Likewise.
	(HSA_RUNTIME_LDFLAGS): Likewise.
	(hsa-runtime): New configure option.
	(hsa-runtime-include): Likewise.
	(hsa-runtime-lib): Likewise.
	(PLUGIN_HSA): New substitution variable.
	Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new
	configure options.
	(PLUGIN_HSA_CPPFLAGS): Likewise.
	(PLUGIN_HSA_LDFLAGS): Likewise.
	(PLUGIN_HSA_LIBS): Likewise.
	Check that we have access to HSA run-time.

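The handling of -foffload in the opts.c hunk can be read as a small stand-alone function.  The following is only a sketch under that reading: the function name is hypothetical, and the ENABLE_HSA / sorry () branch of the real hunk is left out, so this shows just the list-walking logic.  HSA counts as disabled unless the list contains the exact entry "hsa", and an exact "disable" entry ends the scan with HSA disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper mirroring the loop in the opts.c hunk.  ARG is the
   comma-separated argument of -foffload=.  Returns true when HSA code
   generation should be considered disabled.  */
bool
hsa_disabled_by_foffload (const char *arg)
{
  assert (arg != NULL);

  /* Giving -foffload at all disables HSA unless "hsa" is listed.  */
  bool disabled = true;
  const char *p = arg;

  while (*p != '\0')
    {
      const char *comma = strchr (p, ',');

      /* "disable" must match a whole entry, not just a prefix.  */
      if (strncmp (p, "disable", 7) == 0
          && (p[7] == ',' || p[7] == '\0'))
        return true;

      /* Likewise, "hsa" must be a whole entry.  */
      if (strncmp (p, "hsa", 3) == 0
          && (p[3] == ',' || p[3] == '\0'))
        disabled = false;

      if (comma == NULL)
        break;
      p = comma + 1;
    }
  return disabled;
}
```

Under these assumptions, "nvptx-none" alone disables HSA, "nvptx-none,hsa" keeps it enabled, and "hsa,disable" disables it again because the scan stops at "disable".
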
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index bee2879..5fe73a7 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1296,6 +1296,11 @@ OBJS = \
 	graphite-sese-to-poly.o \
 	gtype-desc.o \
 	haifa-sched.o \
+	hsa.o \
+	hsa-gen.o \
+	hsa-regalloc.o \
+	hsa-brig.o \
+	hsa-dump.o \
 	hw-doloop.o \
 	hwint.o \
 	ifcvt.o \
@@ -1320,6 +1325,7 @@ OBJS = \
 	ipa-icf.o \
 	ipa-icf-gimple.o \
 	ipa-reference.o \
+	ipa-hsa.o \
 	ipa-ref.o \
 	ipa-utils.o \
 	ipa.o \
@@ -2401,6 +2407,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/sanopt.c \
   $(srcdir)/ipa-devirt.c \
   $(srcdir)/internal-fn.h \
+  $(srcdir)/hsa.c \
   @all_gtfiles@
 
 # Compute the list of GT header files from the corresponding C sources,
diff --git a/gcc/config.in b/gcc/config.in
index bb0d220..eee9f60 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -144,6 +144,12 @@
 #endif
 
 
+/* Define this to enable support for generating HSAIL. */
+#ifndef USED_FOR_TARGET
+#undef ENABLE_HSA
+#endif
+
+
 /* Define if gcc should always pass --build-id to linker. */
 #ifndef USED_FOR_TARGET
 #undef ENABLE_LD_BUILDID
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 5990b7c..5ce1881 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -940,6 +940,13 @@ AC_SUBST(accel_dir_suffix)
 
 for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
   tgt=`echo $tgt | sed 's/=.*//'`
+
+  if echo "$tgt" | grep "^hsa" > /dev/null ; then
+    enable_hsa=1
+  else
+    enable_offloading=1
+  fi
+
   if test x"$offload_targets" = x; then
     offload_targets=$tgt
   else
@@ -948,7 +955,7 @@ for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
 done
 AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
   [Define to offload targets, separated by commas.])
-if test x"$offload_targets" != x; then
+if test x"$enable_offloading" != x; then
   AC_DEFINE(ENABLE_OFFLOADING, 1,
     [Define this to enable support for offloading.])
 else
@@ -956,6 +963,11 @@ else
     [Define this to enable support for offloading.])
 fi
 
+if test x"$enable_hsa" = x1 ; then
+  AC_DEFINE(ENABLE_HSA, 1,
+    [Define this to enable support for generating HSAIL.])
+fi
+
 AC_ARG_WITH(multilib-list,
 [AS_HELP_STRING([--with-multilib-list], [select multilibs (AArch64, SH and x86-64 only)])],
 :,
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 0b71bef..232586d 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1982,6 +1982,23 @@ specifying paths @var{path1}, @dots{}, @var{pathN}.
 % @var{srcdir}/configure \
     --enable-offload-target=i686-unknown-linux-gnu=/path/to/i686/compiler,x86_64-pc-linux-gnu
 @end smallexample
+
+If @samp{hsa} is specified as one of the targets, the compiler will be
+built with support for HSA GPU accelerators.  Because the same
+compiler will emit the accelerator code, no path should be specified.
+
+@item --with-hsa-runtime=@var{pathname}
+@itemx --with-hsa-runtime-include=@var{pathname}
+@itemx --with-hsa-runtime-lib=@var{pathname}
+
+If you configure GCC with HSA offloading but do not have the HSA
+run-time library installed in a standard location then you can
+explicitly specify the directory where it is installed.  The
+@option{--with-hsa-runtime=@/@var{hsainstalldir}} option is a
+shorthand for
+@option{--with-hsa-runtime-lib=@/@var{hsainstalldir}/lib} and
+@option{--with-hsa-runtime-include=@/@var{hsainstalldir}/include}.
+
 @end table
 
 @subheading Cross-Compiler-Specific Options
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index e4772d1..5609207 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -745,6 +745,11 @@ compile_images_for_offload_targets (unsigned in_argc, char *in_argv[],
   offload_names = XCNEWVEC (char *, num_targets + 1);
   for (unsigned i = 0; i < num_targets; i++)
     {
+      /* HSA uses neither LTO-like streaming nor a separate offload
+         compiler, skip it.  */
+      if (strncmp (names[i], "hsa", 3) == 0)
+        continue;
+
       offload_names[i]
         = compile_offload_image (names[i], compiler_path, in_argc, in_argv,
                                  compiler_opts, compiler_opt_count,
diff --git a/gcc/opts.c b/gcc/opts.c
index 874c84f..5647f0c 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1906,8 +1906,35 @@ common_handle_option (struct gcc_options *opts,
       break;
 
     case OPT_foffload_:
-      /* Deferred.  */
-      break;
+      {
+        const char *p = arg;
+        opts->x_flag_disable_hsa = true;
+        while (*p != 0)
+          {
+            const char *comma = strchr (p, ',');
+
+            if ((strncmp (p, "disable", 7) == 0)
+                && (p[7] == ',' || p[7] == '\0'))
+              {
+                opts->x_flag_disable_hsa = true;
+                break;
+              }
+
+            if ((strncmp (p, "hsa", 3) == 0)
+                && (p[3] == ',' || p[3] == '\0'))
+              {
+#ifdef ENABLE_HSA
+                opts->x_flag_disable_hsa = false;
+#else
+                sorry ("HSA has not been enabled during configuration");
+#endif
+              }
+            if (!comma)
+              break;
+            p = comma + 1;
+          }
+        break;
+      }
 
 #ifndef ACCEL_COMPILER
     case OPT_foffload_abi_:
diff --git a/libgomp/plugin/Makefrag.am b/libgomp/plugin/Makefrag.am
index 745becd..433bba1 100644
--- a/libgomp/plugin/Makefrag.am
+++ b/libgomp/plugin/Makefrag.am
@@ -38,3 +38,16 @@ libgomp_plugin_nvptx_la_LDFLAGS += $(PLUGIN_NVPTX_LDFLAGS)
 libgomp_plugin_nvptx_la_LIBADD = libgomp.la $(PLUGIN_NVPTX_LIBS)
 libgomp_plugin_nvptx_la_LIBTOOLFLAGS = --tag=disable-static
 endif
+
+if PLUGIN_HSA
+# Heterogeneous System Architecture plugin
+libgomp_plugin_hsa_version_info = -version-info $(libtool_VERSION)
+toolexeclib_LTLIBRARIES += libgomp-plugin-hsa.la
+libgomp_plugin_hsa_la_SOURCES = plugin/plugin-hsa.c
+libgomp_plugin_hsa_la_CPPFLAGS = $(AM_CPPFLAGS) $(PLUGIN_HSA_CPPFLAGS)
+libgomp_plugin_hsa_la_LDFLAGS = $(libgomp_plugin_hsa_version_info) \
+	$(lt_host_flags)
+libgomp_plugin_hsa_la_LDFLAGS += $(PLUGIN_HSA_LDFLAGS)
+libgomp_plugin_hsa_la_LIBADD = libgomp.la $(PLUGIN_HSA_LIBS)
+libgomp_plugin_hsa_la_LIBTOOLFLAGS = --tag=disable-static
+endif
diff --git a/libgomp/plugin/configfrag.ac b/libgomp/plugin/configfrag.ac
index ad70dd1..c50e5cb 100644
--- a/libgomp/plugin/configfrag.ac
+++ b/libgomp/plugin/configfrag.ac
@@ -81,6 +81,54 @@ AC_SUBST(PLUGIN_NVPTX_CPPFLAGS)
 AC_SUBST(PLUGIN_NVPTX_LDFLAGS)
 AC_SUBST(PLUGIN_NVPTX_LIBS)
 
+# Look for HSA run-time, its includes and libraries
+
+HSA_RUNTIME_INCLUDE=
+HSA_RUNTIME_LIB=
+AC_SUBST(HSA_RUNTIME_INCLUDE)
+AC_SUBST(HSA_RUNTIME_LIB)
+HSA_RUNTIME_CPPFLAGS=
+HSA_RUNTIME_LDFLAGS=
+
+AC_ARG_WITH(hsa-runtime,
+        [AS_HELP_STRING([--with-hsa-runtime=PATH],
+                [specify prefix directory for installed HSA run-time package.
+                 Equivalent to --with-hsa-runtime-include=PATH/include
+                 plus --with-hsa-runtime-lib=PATH/lib])])
+AC_ARG_WITH(hsa-runtime-include,
+        [AS_HELP_STRING([--with-hsa-runtime-include=PATH],
+                [specify directory for installed HSA run-time include files])])
+AC_ARG_WITH(hsa-runtime-lib,
+        [AS_HELP_STRING([--with-hsa-runtime-lib=PATH],
+                [specify directory for the installed HSA run-time library])])
+if test "x$with_hsa_runtime" != x; then
+  HSA_RUNTIME_INCLUDE=$with_hsa_runtime/include
+  HSA_RUNTIME_LIB=$with_hsa_runtime/lib
+fi
+if test "x$with_hsa_runtime_include" != x; then
+  HSA_RUNTIME_INCLUDE=$with_hsa_runtime_include
+fi
+if test "x$with_hsa_runtime_lib" != x; then
+  HSA_RUNTIME_LIB=$with_hsa_runtime_lib
+fi
+if test "x$HSA_RUNTIME_INCLUDE" != x; then
+  HSA_RUNTIME_CPPFLAGS=-I$HSA_RUNTIME_INCLUDE
+fi
+if test "x$HSA_RUNTIME_LIB" != x; then
+  HSA_RUNTIME_LDFLAGS=-L$HSA_RUNTIME_LIB
+fi
+
+PLUGIN_HSA=0
+PLUGIN_HSA_CPPFLAGS=
+PLUGIN_HSA_LDFLAGS=
+PLUGIN_HSA_LIBS=
+AC_SUBST(PLUGIN_HSA)
+AC_SUBST(PLUGIN_HSA_CPPFLAGS)
+AC_SUBST(PLUGIN_HSA_LDFLAGS)
+AC_SUBST(PLUGIN_HSA_LIBS)
+
+
+
 # Get offload targets and path to install tree of offloading compiler.
 offload_additional_options=
 offload_additional_lib_paths=
@@ -122,6 +170,49 @@ if test x"$enable_offload_targets" != x; then
             ;;
         esac
         ;;
+      hsa*)
+        case "${target}" in
+          x86_64-*-*)
+            case " ${CC} ${CFLAGS} " in
+              *" -m32 "*)
+                PLUGIN_HSA=0
+                ;;
+              *)
+                tgt_name=hsa
+                PLUGIN_HSA=$tgt
+                PLUGIN_HSA_CPPFLAGS=$HSA_RUNTIME_CPPFLAGS
+                PLUGIN_HSA_LDFLAGS=$HSA_RUNTIME_LDFLAGS
+                PLUGIN_HSA_LIBS="-lhsa-runtime64 -lhsakmt"
+
+                PLUGIN_HSA_save_CPPFLAGS=$CPPFLAGS
+                CPPFLAGS="$PLUGIN_HSA_CPPFLAGS $CPPFLAGS"
+                PLUGIN_HSA_save_LDFLAGS=$LDFLAGS
+                LDFLAGS="$PLUGIN_HSA_LDFLAGS $LDFLAGS"
+                PLUGIN_HSA_save_LIBS=$LIBS
+                LIBS="$PLUGIN_HSA_LIBS $LIBS"
+
+                AC_LINK_IFELSE(
+                  [AC_LANG_PROGRAM(
+                    [#include "hsa.h"],
+                      [hsa_status_t status = hsa_init ()])],
+                  [PLUGIN_HSA=1])
+                CPPFLAGS=$PLUGIN_HSA_save_CPPFLAGS
+                LDFLAGS=$PLUGIN_HSA_save_LDFLAGS
+                LIBS=$PLUGIN_HSA_save_LIBS
+                case $PLUGIN_HSA in
+                  hsa*)
+                    PLUGIN_HSA=0
+                    AC_MSG_ERROR([HSA run-time package required for HSA support])
+                    ;;
+                esac
+                ;;
+              esac
+            ;;
+          *-*-*)
+            PLUGIN_HSA=0
+            ;;
+        esac
+        ;;
       *)
         AC_MSG_ERROR([unknown offload target specified])
         ;;
@@ -145,3 +236,6 @@ AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
 AM_CONDITIONAL([PLUGIN_NVPTX], [test $PLUGIN_NVPTX = 1])
 AC_DEFINE_UNQUOTED([PLUGIN_NVPTX], [$PLUGIN_NVPTX],
   [Define to 1 if the NVIDIA plugin is built, 0 if not.])
+AM_CONDITIONAL([PLUGIN_HSA], [test $PLUGIN_HSA = 1])
+AC_DEFINE_UNQUOTED([PLUGIN_HSA], [$PLUGIN_HSA],
+  [Define to 1 if the HSA plugin is built, 0 if not.])
diff --git a/gcc/common.opt b/gcc/common.opt
index e1617c4..1669441 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -229,6 +229,10 @@ unsigned int flag_sanitize_recover = SANITIZE_UNDEFINED | SANITIZE_NONDEFAULT |
 Variable
 bool dump_base_name_prefixed = false
 
+; Flag whether HSA generation has been explicitly disabled
+Variable
+bool flag_disable_hsa = false
+
 ###
 Driver
 
@@ -583,6 +587,10 @@ Wfree-nonheap-object
 Common Var(warn_free_nonheap_object) Init(1) Warning
 Warn when attempting to free a non-heap object.
 
+Whsa
+Common Var(warn_hsa) Init(1) Warning
+Warn when a function cannot be expanded to HSAIL.
+
 Winline
 Common Var(warn_inline) Warning
 Warn when an inlined function cannot be inlined.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 586f11f..37a07bc 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -295,7 +295,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wunused-but-set-parameter  -Wunused-but-set-variable @gol
 -Wuseless-cast  -Wvariadic-macros -Wvector-operation-performance @gol
 -Wvla -Wvolatile-register-var  -Wwrite-strings @gol
--Wzero-as-null-pointer-constant}
+-Wzero-as-null-pointer-constant -Whsa}
 
 @item C and Objective-C-only Warning Options
 @gccoptlist{-Wbad-function-cast  -Wmissing-declarations @gol
@@ -5685,6 +5685,10 @@ Suppress warnings when a positional initializer is used to initialize
 a structure that has been marked with the @code{designated_init}
 attribute.
 
+@item -Whsa
+Issue a warning when HSAIL cannot be emitted for the compiled function or
+OpenMP construct.
+
 @end table
 
 @node Debugging Options
@@ -11210,6 +11214,12 @@ dynamic, guided, auto, runtime).  The default is static.
 Maximum depth of recursion when querying properties of SSA names in things
 like fold routines.  One level of recursion corresponds to following a
 use-def chain.
+
+@item hsa-gen-debug-stores
+Enable emission of special debug stores within HSA kernels which are
+then read and reported by libgomp plugin.  Generation of these stores
+is disabled by default, use @option{--param hsa-gen-debug-stores=1} to
+enable it.
 @end table
 @end table
 
diff --git a/gcc/params.def b/gcc/params.def
index 41fd8a8..411d72c 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1177,6 +1177,11 @@ DEFPARAM (PARAM_MAX_SSA_NAME_QUERY_DEPTH,
           "Maximum recursion depth allowed when querying a property of an"
           " SSA name.",
           2, 1, 0)
+
+DEFPARAM (PARAM_HSA_GEN_DEBUG_STORES,
+          "hsa-gen-debug-stores",
+          "Level of hsa debug stores verbosity",
+          0, 0, 1)
 
 /*
 Local variables: