From patchwork Sun Nov 2 10:08:57 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marc Glisse X-Patchwork-Id: 405851 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3F30614003E for ; Sun, 2 Nov 2014 21:09:13 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; q=dns; s=default; b=trn8ennRcSTPY0rC StE8Txdph2ldJaWdWZ4n2dRGMFMNmshYtG0wehahz6rLfV4bbBvi9FvrUhwpE1hl hvNhl3mmwD3+YnVfl5yUONSBHQD164iXBLaJ37P66JTtdl34GGvjkUtq0T7PIFmq kuZ/tnlLfDFx0Or+VQAALG9N0ws= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:in-reply-to:message-id:references :mime-version:content-type; s=default; bh=qs/9ZpuIdopnBvRiPwRtjo j1EGI=; b=A616RMwGw56BWddN7+8ztw9zm0ob4yGAwCIJISZ8pQkmRZsJYTLYoH YvWIFiDcPNQHAVFA6BejwDh1gDlqzxfjH4uyvPvTfQLgwHPpqprlvFL8f1QvlNA7 XZGqUk/sVGoWd4iqQcKjHnycIUymnBK0vISFHG5G4thorP/VnykU8= Received: (qmail 26775 invoked by alias); 2 Nov 2014 10:09:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 26761 invoked by uid 89); 2 Nov 2014 10:09:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.3 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mail3-relais-sop.national.inria.fr Received: from mail3-relais-sop.national.inria.fr (HELO mail3-relais-sop.national.inria.fr) (192.134.164.104) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Sun, 02 Nov 2014 10:09:01 +0000 Received: from stedding.saclay.inria.fr (HELO stedding) ([193.55.250.194]) by mail3-relais-sop.national.inria.fr with ESMTP/TLS/AES128-SHA; 02 Nov 2014 11:08:57 +0100 Received: from glisse (helo=localhost) by stedding with local-esmtp (Exim 4.84) (envelope-from ) id 1Xks5h-0003z2-Ce; Sun, 02 Nov 2014 11:08:57 +0100 Date: Sun, 2 Nov 2014 11:08:57 +0100 (CET) From: Marc Glisse To: Richard Biener cc: Jakub Jelinek , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Fix fold making valid VEC_PERMs invalid In-Reply-To: Message-ID: References: <20141030091318.GZ10376@tucnak.redhat.com> User-Agent: Alpine 2.02 (DEB 1266 2009-07-14) MIME-Version: 1.0 On Thu, 30 Oct 2014, Richard Biener wrote: > On Thu, 30 Oct 2014, Jakub Jelinek wrote: > >> On Thu, Oct 30, 2014 at 09:56:32AM +0100, Richard Biener wrote: >>> >>> The following patch makes fold_ternary no longer make >>> VEC_PERMs valid for the target invalid. As pointed out >>> in the PR we only need to make sure this doesn't happen >>> after vector lowering. >> >> Well, even if you do that before vector lowering, if the original >> mask is fine and new one is not, you'd seriously pessimize code. > > Note that what fold does here isn't very elaborate - it just > tries hard to make a two-input VEC_PERM a one-input one which > is good for canonicalization and CSE. I'd say that the > optabs.c code should ideally be able to recover the working > variant (or the target expanders should be more clever...). It may be hard for optabs.c. The target expanders should really be fixed. This isn't the same as when we were talking of combining any permutations, here it is something that should be fairly easy to do. In that sense, simply avoiding the ICE (at least in release mode) and leaving optimization to the targets should be enough. Still, I am proposing something closer to Jakub's suggestion below. >> How about moving the VEC_PERM_EXPR arg2 == VECTOR_CST folding >> into a separate function with single_arg argument, call it with >> operand_equal_p (op0, op1, 0) initially and call that function again >> if single_arg and !can_vec_perm_p (...), that time with >> single_arg parameter false? If the original permutation is !can_vec_perm_p, I believe we should transform. That can indeed be done with a function. I did it without a function, but I can try to rewrite it if you want. > Or how about removing the code instead and doing it during > vector lowering if the original permute is not !can_vec_perm_p? I'd rather not delay that much. Here is a proposed patch that passed bootstrap+testsuite on x86_64-linux-gnu. I did not test on arm (or whichever platform it was that failed). If can_vec_perm_p is costly, we can test single_arg first (otherwise the 2 can_vec_perm_p must be the same) and test PROP_gimple_lvec before the second can_vec_perm_p (which must answer true then). I don't remember what the arg2 == op2 test is about, so I kept it. I also didn't try to fix TREE_SIDE_EFFECTS handling, a quick test didn't trigger that issue. 2014-11-03 Marc Glisse * fold-const.c: Include "optabs.h". (fold_ternary_loc) : Avoid canonicalizing a can_vec_perm_p permutation to one that is not. Index: gcc/fold-const.c =================================================================== --- gcc/fold-const.c (revision 217007) +++ gcc/fold-const.c (working copy) @@ -75,20 +75,21 @@ along with GCC; see the file COPYING3. #include "gimple.h" #include "gimplify.h" #include "tree-dfa.h" #include "hash-table.h" /* Required for ENABLE_FOLD_CHECKING. */ #include "builtins.h" #include "hash-map.h" #include "plugin-api.h" #include "ipa-ref.h" #include "cgraph.h" #include "generic-match.h" +#include "optabs.h" /* Nonzero if we are folding constants inside an initializer; zero otherwise. */ int folding_initializer = 0; /* The following constants represent a bit based encoding of GCC's comparison operators. This encoding simplifies transformations on relational comparison operators, such as AND and OR. */ enum comparison_code { COMPCODE_FALSE = 0, @@ -14189,47 +14190,47 @@ fold_ternary_loc (location_t loc, enum t return fold_build2_loc (loc, PLUS_EXPR, type, const_binop (MULT_EXPR, arg0, arg1), arg2); if (integer_zerop (arg2)) return fold_build2_loc (loc, MULT_EXPR, type, arg0, arg1); return fold_fma (loc, type, arg0, arg1, arg2); case VEC_PERM_EXPR: if (TREE_CODE (arg2) == VECTOR_CST) { - unsigned int nelts = TYPE_VECTOR_SUBPARTS (type), i, mask; + unsigned int nelts = TYPE_VECTOR_SUBPARTS (type), i, mask, mask2; unsigned char *sel = XALLOCAVEC (unsigned char, nelts); + unsigned char *sel2 = XALLOCAVEC (unsigned char, nelts); bool need_mask_canon = false; + bool need_mask_canon2 = false; bool all_in_vec0 = true; bool all_in_vec1 = true; bool maybe_identity = true; bool single_arg = (op0 == op1); bool changed = false; mask = single_arg ? (nelts - 1) : (2 * nelts - 1); + mask2 = 2 * nelts - 1; gcc_assert (nelts == VECTOR_CST_NELTS (arg2)); for (i = 0; i < nelts; i++) { tree val = VECTOR_CST_ELT (arg2, i); if (TREE_CODE (val) != INTEGER_CST) return NULL_TREE; /* Make sure that the perm value is in an acceptable range. */ wide_int t = val; - if (wi::gtu_p (t, mask)) - { - need_mask_canon = true; - sel[i] = t.to_uhwi () & mask; - } - else - sel[i] = t.to_uhwi (); + need_mask_canon |= wi::gtu_p (t, mask); + need_mask_canon2 |= wi::gtu_p (t, mask2); + sel[i] = t.to_uhwi () & mask; + sel2[i] = t.to_uhwi () & mask2; if (sel[i] < nelts) all_in_vec1 = false; else all_in_vec0 = false; if ((sel[i] & (nelts-1)) != i) maybe_identity = false; } @@ -14257,20 +14258,31 @@ fold_ternary_loc (location_t loc, enum t || TREE_CODE (op1) == CONSTRUCTOR)) { tree t = fold_vec_perm (type, op0, op1, sel); if (t != NULL_TREE) return t; } if (op0 == op1 && !single_arg) changed = true; + /* Some targets are deficient and fail to expand a single + argument permutation while still allowing an equivalent + 2-argument version. */ + if (need_mask_canon && arg2 == op2 + && !can_vec_perm_p (TYPE_MODE (type), false, sel) + && can_vec_perm_p (TYPE_MODE (type), false, sel2)) + { + need_mask_canon = need_mask_canon2; + sel = sel2; + } + if (need_mask_canon && arg2 == op2) { tree *tsel = XALLOCAVEC (tree, nelts); tree eltype = TREE_TYPE (TREE_TYPE (arg2)); for (i = 0; i < nelts; i++) tsel[i] = build_int_cst (eltype, sel[i]); op2 = build_vector (TREE_TYPE (arg2), tsel); changed = true; }