From patchwork Fri Oct 20 17:25:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 1852722
From: Prathamesh Kulkarni
Date: Fri, 20 Oct 2023 22:55:44 +0530
Subject: PR111754
To: gcc Patches, Richard Sandiford, Richard Biener

Hi,
For the following test-case:

typedef float __attribute__((__vector_size__ (16))) F;
F foo (F a, F b)
{
  F v = (F) { 9 };
  return __builtin_shufflevector (v, v, 1, 0, 1, 2);
}

Compiling with -O2 results in the following ICE:

foo.c: In function ‘foo’:
foo.c:6:10: internal compiler error: in decompose, at rtl.h:2314
    6 |   return __builtin_shufflevector (v, v, 1, 0, 1, 2);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0x7f3185 wi::int_traits >::decompose(long*, unsigned int, std::pair const&)
        ../../gcc/gcc/rtl.h:2314
0x7f3185 wide_int_ref_storage::wide_int_ref_storage >(std::pair const&)
        ../../gcc/gcc/wide-int.h:1089
0x7f3185 generic_wide_int >::generic_wide_int >(std::pair const&)
        ../../gcc/gcc/wide-int.h:847
0x7f3185 poly_int<1u,
generic_wide_int > >::poly_int >(poly_int_full, std::pair const&)
        ../../gcc/gcc/poly-int.h:467
0x7f3185 poly_int<1u, generic_wide_int > >::poly_int >(std::pair const&)
        ../../gcc/gcc/poly-int.h:453
0x7f3185 wi::to_poly_wide(rtx_def const*, machine_mode)
        ../../gcc/gcc/rtl.h:2383
0x7f3185 rtx_vector_builder::step(rtx_def*, rtx_def*) const
        ../../gcc/gcc/rtx-vector-builder.h:122
0xfd4e1b vector_builder::elt(unsigned int) const
        ../../gcc/gcc/vector-builder.h:253
0xfd4d11 rtx_vector_builder::build()
        ../../gcc/gcc/rtx-vector-builder.cc:73
0xc21d9c const_vector_from_tree
        ../../gcc/gcc/expr.cc:13487
0xc21d9c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/gcc/expr.cc:11059
0xaee682 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
        ../../gcc/gcc/expr.h:310
0xaee682 expand_return
        ../../gcc/gcc/cfgexpand.cc:3809
0xaee682 expand_gimple_stmt_1
        ../../gcc/gcc/cfgexpand.cc:3918
0xaee682 expand_gimple_stmt
        ../../gcc/gcc/cfgexpand.cc:4044
0xaf28f0 expand_gimple_basic_block
        ../../gcc/gcc/cfgexpand.cc:6100
0xaf4996 execute
        ../../gcc/gcc/cfgexpand.cc:6835

IIUC, the issue is that fold_vec_perm returns a vector with float
element type and res_nelts_per_pattern == 3, and we later ICE when
trying to derive element v[3], which is not present in the encoding,
while building the rtx vector in rtx_vector_builder::build():

  for (unsigned int i = 0; i < nelts; ++i)
    RTVEC_ELT (v, i) = elt (i);

The attached patch tries to fix this by returning false from
valid_mask_for_fold_vec_perm_cst_p if sel has a stepped sequence and
the input vector has a non-integral element type. So for VLA vectors
with a non-integral element type, it will only build a result with a
dup sequence (nelts_per_pattern < 3).
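
To make the failure mode concrete, here is a minimal toy model of a
pattern-based vector encoding. It is only an illustrative sketch under
stated assumptions, not GCC's vector_builder: ToyEncoding and its
members are hypothetical names, and returning std::nullopt stands in
for the point where the real code asserts. It shows why an element past
the encoded ones can be extrapolated for integers (last + step) but has
no defined step for float elements:

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <vector>

// Hypothetical toy model of an (npatterns, nelts_per_pattern) encoding.
// Only the first npatterns * nelts_per_pattern elements are stored,
// interleaved by pattern; later elements must be derived.
template <typename T>
struct ToyEncoding {
  unsigned npatterns;
  unsigned nelts_per_pattern;
  std::vector<T> encoded;

  // Return element I, or std::nullopt if it cannot be derived.
  std::optional<T> elt(unsigned i) const {
    // Elements present in the encoding are returned directly.
    if (i < encoded.size())
      return encoded[i];

    unsigned pattern = i % npatterns;
    unsigned count = i / npatterns;
    unsigned final_i = encoded.size() - npatterns + pattern;
    T last = encoded[final_i];

    // Dup (1) or base + dup (2): the final encoded value repeats.
    if (nelts_per_pattern < 3)
      return last;

    // Stepped (3): extrapolate from the last two encoded elements.
    // This needs integer arithmetic, so it is undefined for floats.
    if constexpr (std::is_integral_v<T>) {
      T prev = encoded[final_i - npatterns];
      T step = last - prev;
      return last + step * static_cast<T>(count - 2);
    } else {
      return std::nullopt;
    }
  }
};
```

With an integer selector { 1, 0, 1 } encoded as one stepped pattern,
elt (3) extrapolates to 2. With the float result { 0.0, 9.0, 0.0 }
encoded the same way, elt (3) cannot be derived, which is roughly the
situation the backtrace above ends in.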

For VLS vectors, this will still work for a stepped sequence, since it
will then use the "VLS exception" in fold_vec_perm_cst and set:

  res_npatterns = res_nelts
  res_nelts_per_pattern = 1

and fold the above case to:

F foo (F a, F b)
{
  [local count: 1073741824]:
  return { 0.0, 9.0e+0, 0.0, 0.0 };
}

But I am not sure if this is entirely correct, since:

  tree res = out_elts.build ();

will canonicalize the encoding and may result in a stepped sequence
(vector_builder::finalize() may reduce npatterns at the cost of
increasing nelts_per_pattern)?

PS: This issue is now latent after the PR111648 fix, since
valid_mask_for_fold_vec_perm_cst_p with sel = {1, 0, 1, ...} returns
false because the corresponding pattern in arg0 is not a natural
stepped sequence, and the case folds correctly using the VLS exception.
However, I guess the underlying issue of dealing with non-integral
element types in fold_vec_perm_cst still remains?

The patch passes bootstrap+test with and without SVE on
aarch64-linux-gnu, and on x86_64-linux-gnu.

Thanks,
Prathamesh

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 82299bb7f1d..cedfc9616e9 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -10642,6 +10642,11 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree arg1,
   if (sel_nelts_per_pattern < 3)
     return true;
 
+  /* If SEL contains stepped sequence, ensure that we are dealing with
+     integral vector_cst.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (arg0))))
+    return false;
+
   for (unsigned pattern = 0; pattern < sel_npatterns; pattern++)
     {
       poly_uint64 a1 = sel[pattern + sel_npatterns];
diff --git a/gcc/testsuite/gcc.dg/vect/pr111754.c b/gcc/testsuite/gcc.dg/vect/pr111754.c
new file mode 100644
index 00000000000..7c1c16875c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr111754.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef float __attribute__((__vector_size__ (16))) F;
+
+F foo (F a, F b)
+{
+  F v = (F) { 9 };
+  return __builtin_shufflevector (v, v, 1, 0, 1, 2);
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
+/* { dg-final { scan-tree-dump "return \{ 0.0, 9.0e\\+0, 0.0, 0.0 \}" "optimized" } } */