rs6000: correct BE vextract_fp_from_short[hl] vperm mask

Message ID	CAGWvnykKFJDZ9AdtfcWAE6dcbxXFmq3P8WqaqOGLzTcHyGcA0g@mail.gmail.com
State	New
Headers	Return-Path: <gcc-patches-bounces@gcc.gnu.org> DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 01FDF3854826 MIME-Version: 1.0 Date: Mon, 19 Oct 2020 09:16:50 -0400 Message-ID: <CAGWvnykKFJDZ9AdtfcWAE6dcbxXFmq3P8WqaqOGLzTcHyGcA0g@mail.gmail.com> Subject: [PATCH] rs6000: correct BE vextract_fp_from_short[hl] vperm mask To: GCC Patches <gcc-patches@gcc.gnu.org> Content-Type: text/plain; charset="UTF-8" Precedence: list From: David Edelsohn via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: David Edelsohn <dje.gcc@gmail.com> Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces@gcc.gnu.org>
Series	rs6000: correct BE vextract_fp_from_short[hl] vperm mask

Commit Message

David Edelsohn Oct. 19, 2020, 1:16 p.m. UTC

The xvcvhpsp instruction converts a vector of half precision values to single
    precision.  The intrinsics vextract_fp_from_shorth and
    vextract_fp_from_shortl select the high or low four elements of a
    half precision vector to convert.  The intrinsics use vperm to select
    the appropriate half of the half precision vector and redistribute
    the values for the xvcvhpsp instruction.  The big endian versions of the
    masks for the intrinsics were initialized incorrectly.  This patch replaces
    the masks with the correct values.  This corrects the failure of the
    builtins-3-p9-runnable.c testcase on big endian systems.

    Bootstrapped powerpc-ibm-aix7.2.3.0 Power9.  Committed.

    gcc/ChangeLog:

            * config/rs6000/vsx.md (vextract_fp_from_shorth): Fix vals_be.
            (vextract_fp_from_shortl): Same.

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 4ff52455fd3..c023bc0baaa 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5659,7 +5659,7 @@  (define_expand "vextract_fp_from_shorth"
 {
   int i;
   int vals_le[16] = {15, 14, 0, 0, 13, 12, 0, 0, 11, 10, 0, 0, 9, 8, 0, 0};
-  int vals_be[16] = {7, 6, 0, 0, 5, 4, 0, 0, 3, 2, 0, 0, 1, 0, 0, 0};
+  int vals_be[16] = {0, 0, 0, 1, 0, 0, 2, 3, 0, 0, 4, 5, 0, 0, 6, 7};

   rtx rvals[16];
   rtx mask = gen_reg_rtx (V16QImode);
@@ -5693,7 +5693,7 @@  (define_expand "vextract_fp_from_shortl"
   "TARGET_P9_VECTOR"
 {
   int vals_le[16] = {7, 6, 0, 0, 5, 4, 0, 0, 3, 2, 0, 0, 1, 0, 0, 0};
-  int vals_be[16] = {15, 14, 0, 0, 13, 12, 0, 0, 11, 10, 0, 0, 9, 8, 0, 0};
+  int vals_be[16] = {0, 0, 8, 9, 0, 0, 10, 11, 0, 0, 12, 13, 0, 0, 14, 15};

   int i;
   rtx rvals[16];