X-Patchwork-Submitter: Bill Schmidt
X-Patchwork-Id: 1039825
To: GCC Patches
Cc: Segher Boessenkool
From: Bill Schmidt
Subject: [PATCH, v2] rs6000: Vector shift-right should honor modulo semantics
Date: Mon, 11 Feb 2019 07:36:11 -0600

Hi!

We had a problem report for code attempting to implement a vector
right-shift for a vector long long (V2DImode) type.  The programmer
noted that we don't have a vector splat-immediate for this mode, but
cleverly realized he could use a vector char splat-immediate since only
the lower 6 bits of the shift count are read.  This is a documented
feature of both the vector shift built-ins and the underlying
instruction.

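To make the modulo behavior concrete, here is a minimal scalar sketch in
portable C of what one doubleword element sees when the shift count
comes from a char splat.  The helper names (splat_byte, srd_model,
srad_model, check_patch_test_values) are hypothetical and exist only for
illustration; they are not GCC or altivec.h functions.  The model
assumes the documented behavior that only the low 6 bits of each
doubleword count are used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the doubleword value seen when a byte splat
   is reinterpreted as a vector of doublewords.  */
uint64_t
splat_byte (uint8_t b)
{
  return (uint64_t) b * UINT64_C (0x0101010101010101);
}

/* Hypothetical scalar model of one element of a doubleword
   shift-right logical: the count is reduced modulo 64.  */
uint64_t
srd_model (uint64_t x, uint64_t count)
{
  return x >> (count % 64);
}

/* Same model for shift-right algebraic.  Negative inputs go through
   the complement so the result does not depend on how the compiler
   shifts negative values.  */
int64_t
srad_model (int64_t x, uint64_t count)
{
  unsigned int n = (unsigned int) (count % 64);
  return x < 0 ? ~(~x >> n) : x >> n;
}

/* Check the model against the values used in the new tests.  */
void
check_patch_test_values (void)
{
  uint64_t count = splat_byte (4);  /* 0x0404040404040404 */

  assert (count >= 64);             /* out of range if used unreduced */
  assert (count % 64 == 4);         /* the shift the programmer meant */
  assert (srd_model (1992357, count) == 124522);
  assert (srd_model (1025, count) == 64);
  assert (srad_model (-256, count) == -16);
  assert (srad_model (1025, count) == 64);
}
```

Calling check_patch_test_values () from a main function should pass
silently.  The expected values are the ones checked by
vec-srd-modulo.c and vec-srad-modulo.c in this patch.
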
Starting with GCC 8, the vector shifts are expanded early in
rs6000_gimple_fold_builtin.  However, the GIMPLE folding does not
currently perform the required TRUNC_MOD_EXPR to implement the built-in
semantics.  It appears that this was caught earlier for vector
shift-left and fixed, but the same problem was not fixed for vector
shift-right.  This patch fixes that.

I've added executable tests for both shift-right algebraic and
shift-right logical.  Both fail prior to applying the patch, and work
correctly afterwards.

Minor differences from v1 of this patch:
 * Deleted code to defer some vector splats to expand, which was
   unnecessary.
 * Removed powerpc64*-*-* target selector, added -mvsx option, removed
   extra braces from dg-options directive.
 * Added __attribute__ ((noinline)) to test_s*di_4 functions.
 * Corrected typoed function names.
 * Changed test case names.
 * Added vec-sld-modulo.c as requested (works both before and after
   this patch).

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this okay for trunk, and for GCC 8.3 if there is no
fallout by the end of the week?

Thanks,
Bill


[gcc]

2019-02-11  Bill Schmidt

	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Shift-right
	and shift-left vector built-ins need to include a TRUNC_MOD_EXPR
	for correct semantics.

[gcc/testsuite]

2019-02-11  Bill Schmidt

	* gcc.target/powerpc/vec-sld-modulo.c: New.
	* gcc.target/powerpc/vec-srad-modulo.c: New.
	* gcc.target/powerpc/vec-srd-modulo.c: New.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 268707)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -15735,13 +15735,37 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
     case ALTIVEC_BUILTIN_VSRAH:
     case ALTIVEC_BUILTIN_VSRAW:
     case P8V_BUILTIN_VSRAD:
-      arg0 = gimple_call_arg (stmt, 0);
-      arg1 = gimple_call_arg (stmt, 1);
-      lhs = gimple_call_lhs (stmt);
-      g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, arg1);
-      gimple_set_location (g, gimple_location (stmt));
-      gsi_replace (gsi, g, true);
-      return true;
+      {
+        arg0 = gimple_call_arg (stmt, 0);
+        arg1 = gimple_call_arg (stmt, 1);
+        lhs = gimple_call_lhs (stmt);
+        tree arg1_type = TREE_TYPE (arg1);
+        tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+        tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+        location_t loc = gimple_location (stmt);
+        /* Force arg1 into the range valid matching the arg0 type.  */
+        /* Build a vector consisting of the max valid bit-size values.  */
+        int n_elts = VECTOR_CST_NELTS (arg1);
+        tree element_size = build_int_cst (unsigned_element_type,
+                                           128 / n_elts);
+        tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+        for (int i = 0; i < n_elts; i++)
+          elts.safe_push (element_size);
+        tree modulo_tree = elts.build ();
+        /* Modulo the provided shift value against that vector.  */
+        gimple_seq stmts = NULL;
+        tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+                                           unsigned_arg1_type, arg1);
+        tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+                                      unsigned_arg1_type, unsigned_arg1,
+                                      modulo_tree);
+        gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+        /* And finally, do the shift.  */
+        g = gimple_build_assign (lhs, RSHIFT_EXPR, arg0, new_arg1);
+        gimple_set_location (g, loc);
+        gsi_replace (gsi, g, true);
+        return true;
+      }
     /* Flavors of vector shift left.
        builtin_altivec_vsl{b,h,w} -> vsl{b,h,w}.  */
     case ALTIVEC_BUILTIN_VSLB:
@@ -15795,14 +15819,34 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *
         arg0 = gimple_call_arg (stmt, 0);
         arg1 = gimple_call_arg (stmt, 1);
         lhs = gimple_call_lhs (stmt);
+        tree arg1_type = TREE_TYPE (arg1);
+        tree unsigned_arg1_type = unsigned_type_for (TREE_TYPE (arg1));
+        tree unsigned_element_type = unsigned_type_for (TREE_TYPE (arg1_type));
+        location_t loc = gimple_location (stmt);
         gimple_seq stmts = NULL;
         /* Convert arg0 to unsigned.  */
         tree arg0_unsigned
           = gimple_build (&stmts, VIEW_CONVERT_EXPR,
                           unsigned_type_for (TREE_TYPE (arg0)), arg0);
+        /* Force arg1 into the range valid matching the arg0 type.  */
+        /* Build a vector consisting of the max valid bit-size values.  */
+        int n_elts = VECTOR_CST_NELTS (arg1);
+        tree element_size = build_int_cst (unsigned_element_type,
+                                           128 / n_elts);
+        tree_vector_builder elts (unsigned_arg1_type, n_elts, 1);
+        for (int i = 0; i < n_elts; i++)
+          elts.safe_push (element_size);
+        tree modulo_tree = elts.build ();
+        /* Modulo the provided shift value against that vector.  */
+        tree unsigned_arg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+                                           unsigned_arg1_type, arg1);
+        tree new_arg1 = gimple_build (&stmts, loc, TRUNC_MOD_EXPR,
+                                      unsigned_arg1_type, unsigned_arg1,
+                                      modulo_tree);
+        /* Do the shift.  */
         tree res
           = gimple_build (&stmts, RSHIFT_EXPR,
-                          TREE_TYPE (arg0_unsigned), arg0_unsigned, arg1);
+                          TREE_TYPE (arg0_unsigned), arg0_unsigned, new_arg1);
         /* Convert result back to the lhs type.  */
         res = gimple_build (&stmts, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), res);
         gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
Index: gcc/testsuite/gcc.target/powerpc/vec-sld-modulo.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-sld-modulo.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/vec-sld-modulo.c	(working copy)
@@ -0,0 +1,42 @@
+/* Test that using a character splat to set up a shift-left
+   for a doubleword vector works correctly after gimple folding.  */
+
+/* { dg-do run { target { vsx_hw } } } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include <altivec.h>
+
+typedef __vector unsigned long long vui64_t;
+
+static inline vui64_t
+vec_sldi (vui64_t vra, const unsigned int shb)
+{
+  vui64_t lshift;
+  vui64_t result;
+
+  /* Note legitimate use of wrong-type splat due to expectation that only
+     lower 6-bits are read.  */
+  lshift = (vui64_t) vec_splat_s8((const unsigned char)shb);
+
+  /* Vector Shift Left Doublewords based on the lower 6-bits
+     of corresponding element of lshift.  */
+  result = vec_vsld (vra, lshift);
+
+  return (vui64_t) result;
+}
+
+__attribute__ ((noinline)) vui64_t
+test_sldi_4 (vui64_t a)
+{
+  return vec_sldi (a, 4);
+}
+
+int
+main ()
+{
+  vui64_t x = {-256, 1025};
+  x = test_sldi_4 (x);
+  if (x[0] != -4096 || x[1] != 16400)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/vec-srad-modulo.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-srad-modulo.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/vec-srad-modulo.c	(working copy)
@@ -0,0 +1,43 @@
+/* Test that using a character splat to set up a shift-right algebraic
+   for a doubleword vector works correctly after gimple folding.  */
+
+/* { dg-do run { target { vsx_hw } } } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include <altivec.h>
+
+typedef __vector unsigned long long vui64_t;
+typedef __vector long long vi64_t;
+
+static inline vi64_t
+vec_sradi (vi64_t vra, const unsigned int shb)
+{
+  vui64_t rshift;
+  vi64_t result;
+
+  /* Note legitimate use of wrong-type splat due to expectation that only
+     lower 6-bits are read.  */
+  rshift = (vui64_t) vec_splat_s8((const unsigned char)shb);
+
+  /* Vector Shift Right Algebraic Doublewords based on the lower 6-bits
+     of corresponding element of rshift.  */
+  result = vec_vsrad (vra, rshift);
+
+  return (vi64_t) result;
+}
+
+__attribute__ ((noinline)) vi64_t
+test_sradi_4 (vi64_t a)
+{
+  return vec_sradi (a, 4);
+}
+
+int
+main ()
+{
+  vi64_t x = {-256, 1025};
+  x = test_sradi_4 (x);
+  if (x[0] != -16 || x[1] != 64)
+    __builtin_abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.target/powerpc/vec-srd-modulo.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-srd-modulo.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/vec-srd-modulo.c	(working copy)
@@ -0,0 +1,42 @@
+/* Test that using a character splat to set up a shift-right logical
+   for a doubleword vector works correctly after gimple folding.  */
+
+/* { dg-do run { target { vsx_hw } } } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include <altivec.h>
+
+typedef __vector unsigned long long vui64_t;
+
+static inline vui64_t
+vec_srdi (vui64_t vra, const unsigned int shb)
+{
+  vui64_t rshift;
+  vui64_t result;
+
+  /* Note legitimate use of wrong-type splat due to expectation that only
+     lower 6-bits are read.  */
+  rshift = (vui64_t) vec_splat_s8((const unsigned char)shb);
+
+  /* Vector Shift Right [Logical] Doublewords based on the lower 6-bits
+     of corresponding element of rshift.  */
+  result = vec_vsrd (vra, rshift);
+
+  return (vui64_t) result;
+}
+
+__attribute__ ((noinline)) vui64_t
+test_srdi_4 (vui64_t a)
+{
+  return vec_srdi (a, 4);
+}
+
+int
+main ()
+{
+  vui64_t x = {1992357, 1025};
+  x = test_srdi_4 (x);
+  if (x[0] != 124522 || x[1] != 64)
+    __builtin_abort ();
+  return 0;
+}