From patchwork Fri May 5 12:13:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1777595
Date: Fri, 5 May 2023 14:13:57 +0200
Subject: [PATCH] i386: Introduce mulv2si3 instruction
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Uros Bizjak via Gcc-patches
From: Uros Bizjak
Reply-To: Uros Bizjak
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org
Sender: "Gcc-patches"

For SSE2 targets the expander unpacks input elements into the correct
position in the V4SI vector and emits PMULUDQ instruction.  The output
elements are then shuffled back to their positions in the V2SI vector.

For SSE4 targets PMULLD instruction is emitted directly.

gcc/ChangeLog:

    * config/i386/mmx.md (mulv2si3): New expander.
    (*mulv2si3): New insn pattern.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/sse2-mmx-mult-vec.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 872ddbc55f2..6dd203f4fa8 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -2092,6 +2092,55 @@ (define_insn "*3"
    (set_attr "type" "sseadd")
    (set_attr "mode" "TI")])
 
+(define_expand "mulv2si3"
+  [(set (match_operand:V2SI 0 "register_operand")
+        (mult:V2SI
+          (match_operand:V2SI 1 "register_operand")
+          (match_operand:V2SI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
+{
+  if (!TARGET_SSE4_1)
+    {
+      rtx op1 = lowpart_subreg (V4SImode, force_reg (V2SImode, operands[1]),
+                                V2SImode);
+      rtx op2 = lowpart_subreg (V4SImode, force_reg (V2SImode, operands[2]),
+                                V2SImode);
+
+      rtx tmp1 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_vec_interleave_lowv4si (tmp1, op1, op1));
+      rtx tmp2 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_vec_interleave_lowv4si (tmp2, op2, op2));
+
+      rtx res = gen_reg_rtx (V2DImode);
+      emit_insn (gen_vec_widen_umult_even_v4si (res, tmp1, tmp2));
+
+      rtx op0 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_sse2_pshufd_1 (op0, gen_lowpart (V4SImode, res),
+                                    const0_rtx, const2_rtx,
+                                    const0_rtx, const2_rtx));
+
+      emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
+      DONE;
+    }
+})
+
+(define_insn "*mulv2si3"
+  [(set (match_operand:V2SI 0 "register_operand" "=Yr,*x,v")
+        (mult:V2SI
+          (match_operand:V2SI 1 "register_operand" "%0,0,v")
+          (match_operand:V2SI 2 "register_operand" "Yr,*x,v")))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+  "@
+   pmulld\t{%2, %0|%0, %2}
+   pmulld\t{%2, %0|%0, %2}
+   vpmulld\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "sseimul")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "btver2_decode" "vector")
+   (set_attr "mode" "TI")])
+
 (define_expand "mmx_mulv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
         (mult:V4HI (match_operand:V4HI 1 "register_mmxmem_operand")
diff --git a/gcc/testsuite/gcc.target/i386/sse2-mmx-mult-vec.c b/gcc/testsuite/gcc.target/i386/sse2-mmx-mult-vec.c
new file mode 100644
index 00000000000..cdc9a7bb8bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse2-mmx-mult-vec.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+/* { dg-require-effective-target sse2 } */
+
+#include "sse2-check.h"
+
+#define N 2
+
+int a[N] = {-287807, 604344};
+int b[N] = {474362, 874120};
+int r[N];
+
+int rc[N] = {914249338, -11800128};
+
+static void
+sse2_test (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    r[i] = a[i] * b[i];
+
+  /* check results: */
+  for (i = 0; i < N; i++)
+    if (r[i] != rc[i])
+      abort ();
+}