From patchwork Thu Sep 10 12:26:03 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 516250
To: GCC Patches
From: Nathan Sidwell
Subject: [gomp4] predicate register caching
Message-ID: <55F176DB.7040004@acm.org>
Date: Thu, 10 Sep 2015 08:26:03 -0400

I've committed this to gomp4.  Rather than recalculate the 'not lane 0'
predicate on each use, we calculate it at the top of the function and use
throughout.  This appears to be the recommended approach.

nathan

2015-09-10  Nathan Sidwell

	* config/nvptx/nvptx.c (nvptx_init_axis_predicate): New.
	(nvptx_declare_function_name): Initialize axis predicates.
	(nvptx_single): Use or init machine_function axis predicate.
	* config/nvptx/nvptx.h (struct machine_function): Add
	axis_predicate field.

Index: gcc/config/nvptx/nvptx.c
===================================================================
--- gcc/config/nvptx/nvptx.c	(revision 227632)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -603,6 +603,20 @@ nvptx_record_needed_fndecl (tree decl)
   *slot = decl;
 }
 
+/* Emit code to initialize the REGNO predicate register to indicate
+   whether we are not lane zero on the NAME axis.  */
+
+static void
+nvptx_init_axis_predicate (FILE *file, int regno, const char *name)
+{
+  fprintf (file, "\t{\n");
+
+  fprintf (file, "\t.reg.u32\t%%%s;\n", name);
+  fprintf (file, "\t\tmov.u32\t%%%s, %%tid.%s;\n", name, name);
+  fprintf (file, "\t\tsetp.ne.u32\t%%r%d, %%%s, 0;\n", regno, name);
+  fprintf (file, "\t}\n");
+}
+
 /* Implement ASM_DECLARE_FUNCTION_NAME.  Writes the start of a ptx
    function, including local var decls and copies from the arguments to
    local regs.  */
@@ -727,6 +741,14 @@ nvptx_declare_function_name (FILE *file,
   if (stdarg_p (fntype))
     fprintf (file, "\tld.param.u%d %%argp, [%%in_argp];\n",
              GET_MODE_BITSIZE (Pmode));
+
+  /* Emit axis predicates.  */
+  if (cfun->machine->axis_predicate[0])
+    nvptx_init_axis_predicate (file,
+                               REGNO (cfun->machine->axis_predicate[0]), "y");
+  if (cfun->machine->axis_predicate[1])
+    nvptx_init_axis_predicate (file,
+                               REGNO (cfun->machine->axis_predicate[1]), "x");
 }
 
 /* Output a return instruction.  Also copy the return value to its outgoing
@@ -2958,13 +2980,15 @@ nvptx_single (unsigned mask, basic_block
       for (mode = GOMP_DIM_WORKER; mode <= GOMP_DIM_VECTOR; mode++)
         if (GOMP_DIM_MASK (mode) & skip_mask)
           {
-            rtx id = gen_reg_rtx (SImode);
-            rtx pred = gen_reg_rtx (BImode);
             rtx_code_label *label = gen_label_rtx ();
+            rtx pred = cfun->machine->axis_predicate[mode - GOMP_DIM_WORKER];
 
-            emit_insn_before (gen_oacc_dim_pos (id, GEN_INT (mode)), head);
-            rtx cond = gen_rtx_SET (pred, gen_rtx_NE (BImode, id, const0_rtx));
-            emit_insn_before (cond, head);
+            if (!pred)
+              {
+                pred = gen_reg_rtx (BImode);
+                cfun->machine->axis_predicate[mode - GOMP_DIM_WORKER] = pred;
+              }
+
             rtx br;
             if (mode == GOMP_DIM_VECTOR)
               br = gen_br_true (pred, label);
Index: gcc/config/nvptx/nvptx.h
===================================================================
--- gcc/config/nvptx/nvptx.h	(revision 227632)
+++ gcc/config/nvptx/nvptx.h	(working copy)
@@ -238,6 +238,7 @@ struct GTY(()) machine_function
   HOST_WIDE_INT outgoing_stdarg_size;
   int ret_reg_mode;
   int punning_buffer_size;
+  rtx axis_predicate[2];
 };
 #endif
 