From patchwork Fri May 21 06:04:55 2021
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1482019
Date: Fri, 21 May 2021 08:04:55 +0200
Subject: [PATCH] i386: Add minmax and abs patterns for 4-byte vectors [PR100637]
To: "gcc-patches@gcc.gnu.org"
From: Uros Bizjak

2021-05-21  Uroš Bizjak

gcc/
	PR target/100637
	* config/i386/mmx.md (SMAXMIN_MMXMODEI): New mode iterator.
	(3): Macroize expander from v4hi3> and 3 using
	SMAXMIN_MMXMODEI mode iterator.
	(*v4qi3): New insn pattern.
	(*v2hi3): Ditto.
	(SMAXMIN_VI_32): New mode iterator.
	(mode3): New expander.
	(UMAXMIN_MMXMODEI): New mode iterator.
	(3): Macroize expander from v8qi3> and 3 using
	UMAXMIN_MMXMODEI mode iterator.
	(*v4qi3): New insn pattern.
	(*v2hi3): Ditto.
	(UMAXMIN_VI_32): New mode iterator.
	(mode3): New expander.
	(abs2): New insn pattern.
	(ssse3_abs2, abs2): Move from ...
	* config/i386/sse.md: ... here.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index baeed04d8c9..5e92be34545 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1691,13 +1691,11 @@ (define_insn "*sse2_umulv1siv1di3"
    (set_attr "type" "mmxmul,ssemul,ssemul")
    (set_attr "mode" "DI,TI,TI")])
 
-(define_expand "3"
-  [(set (match_operand:MMXMODE14 0 "register_operand")
-        (smaxmin:MMXMODE14
-          (match_operand:MMXMODE14 1 "register_operand")
-          (match_operand:MMXMODE14 2 "register_operand")))]
-  "TARGET_MMX_WITH_SSE && TARGET_SSE4_1"
-  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;
+;; Parallel integral shifts
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
 (define_insn "*mmx_3"
   [(set (match_operand:MMXMODE14 0 "register_operand" "=Yr,*x,Yv")
@@ -1725,14 +1723,6 @@ (define_expand "mmx_v4hi3"
    && (TARGET_SSE || TARGET_3DNOW_A)"
   "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
 
-(define_expand "v4hi3"
-  [(set (match_operand:V4HI 0 "register_operand")
-        (smaxmin:V4HI
-          (match_operand:V4HI 1 "register_operand")
-          (match_operand:V4HI 2 "register_operand")))]
-  "TARGET_MMX_WITH_SSE"
-  "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
-
 (define_insn "*mmx_v4hi3"
   [(set (match_operand:V4HI 0 "register_operand" "=y,x,Yw")
         (smaxmin:V4HI
@@ -1750,14 +1740,58 @@ (define_insn "*mmx_v4hi3"
    (set_attr "type" "mmxadd,sseiadd,sseiadd")
    (set_attr "mode" "DI,TI,TI")])
 
+(define_mode_iterator SMAXMIN_MMXMODEI
+  [(V8QI "TARGET_SSE4_1") V4HI (V2SI "TARGET_SSE4_1")])
+
 (define_expand "3"
-  [(set (match_operand:MMXMODE24 0 "register_operand")
-        (umaxmin:MMXMODE24
-          (match_operand:MMXMODE24 1 "register_operand")
-          (match_operand:MMXMODE24 2 "register_operand")))]
-  "TARGET_MMX_WITH_SSE && TARGET_SSE4_1"
+  [(set (match_operand:SMAXMIN_MMXMODEI 0 "register_operand")
+        (smaxmin:SMAXMIN_MMXMODEI
+          (match_operand:SMAXMIN_MMXMODEI 1 "register_operand")
+          (match_operand:SMAXMIN_MMXMODEI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
   "ix86_fixup_binary_operands_no_copy (, mode, operands);")
 
+(define_insn "*v4qi3"
+  [(set (match_operand:V4QI 0 "register_operand" "=Yr,*x,Yv")
+        (smaxmin:V4QI
+          (match_operand:V4QI 1 "register_operand" "%0,0,Yv")
+          (match_operand:V4QI 2 "register_operand" "Yr,*x,Yv")))]
+  "TARGET_SSE4_1
+   && ix86_binary_operator_ok (, V4QImode, operands)"
+  "@
+   pb\t{%2, %0|%0, %2}
+   pb\t{%2, %0|%0, %2}
+   vpb\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "sseiadd")
+   (set_attr "prefix_extra" "1,1,*")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "mode" "TI")])
+
+(define_insn "*v2hi3"
+  [(set (match_operand:V2HI 0 "register_operand" "=x,Yw")
+        (smaxmin:V2HI
+          (match_operand:V2HI 1 "register_operand" "%0,Yw")
+          (match_operand:V2HI 2 "register_operand" "x,Yw")))]
+  "TARGET_SSE2
+   && ix86_binary_operator_ok (, V2HImode, operands)"
+  "@
+   pw\t{%2, %0|%0, %2}
+   vpw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseiadd")
+   (set_attr "mode" "TI")])
+
+(define_mode_iterator SMAXMIN_VI_32 [(V4QI "TARGET_SSE4_1") V2HI])
+
+(define_expand "3"
+  [(set (match_operand:SMAXMIN_VI_32 0 "register_operand")
+        (smaxmin:SMAXMIN_VI_32
+          (match_operand:SMAXMIN_VI_32 1 "register_operand")
+          (match_operand:SMAXMIN_VI_32 2 "register_operand")))]
+  "TARGET_SSE2"
+  "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
+
 (define_insn "*mmx_3"
   [(set (match_operand:MMXMODE24 0 "register_operand" "=Yr,*x,Yv")
         (umaxmin:MMXMODE24
@@ -1784,14 +1818,6 @@ (define_expand "mmx_v8qi3"
    && (TARGET_SSE || TARGET_3DNOW_A)"
   "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);")
 
-(define_expand "v8qi3"
-  [(set (match_operand:V8QI 0 "register_operand")
-        (umaxmin:V8QI
-          (match_operand:V8QI 1 "register_operand")
-          (match_operand:V8QI 2 "register_operand")))]
-  "TARGET_MMX_WITH_SSE"
-  "ix86_fixup_binary_operands_no_copy (, V8QImode, operands);")
-
 (define_insn "*mmx_v8qi3"
   [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yw")
         (umaxmin:V8QI
@@ -1809,6 +1835,97 @@ (define_insn "*mmx_v8qi3"
    (set_attr "type" "mmxadd,sseiadd,sseiadd")
    (set_attr "mode" "DI,TI,TI")])
 
+(define_mode_iterator UMAXMIN_MMXMODEI
+  [V8QI (V4HI "TARGET_SSE4_1") (V2SI "TARGET_SSE4_1")])
+
+(define_expand "3"
+  [(set (match_operand:UMAXMIN_MMXMODEI 0 "register_operand")
+        (umaxmin:UMAXMIN_MMXMODEI
+          (match_operand:UMAXMIN_MMXMODEI 1 "register_operand")
+          (match_operand:UMAXMIN_MMXMODEI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
+  "ix86_fixup_binary_operands_no_copy (, mode, operands);")
+
+(define_insn "*v4qi3"
+  [(set (match_operand:V4QI 0 "register_operand" "=x,Yw")
+        (umaxmin:V4QI
+          (match_operand:V4QI 1 "register_operand" "%0,Yw")
+          (match_operand:V4QI 2 "register_operand" "x,Yw")))]
+  "TARGET_SSE2
+   && ix86_binary_operator_ok (, V4QImode, operands)"
+  "@
+   pb\t{%2, %0|%0, %2}
+   vpb\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseiadd")
+   (set_attr "mode" "TI")])
+
+(define_insn "*v2hi3"
+  [(set (match_operand:V2HI 0 "register_operand" "=Yr,*x,Yv")
+        (umaxmin:V2HI
+          (match_operand:V2HI 1 "register_operand" "%0,0,Yv")
+          (match_operand:V2HI 2 "register_operand" "Yr,*x,Yv")))]
+  "TARGET_SSE4_1
+   && ix86_binary_operator_ok (, V2HImode, operands)"
+  "@
+   pw\t{%2, %0|%0, %2}
+   pw\t{%2, %0|%0, %2}
+   vpw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "sseiadd")
+   (set_attr "prefix_extra" "1,1,*")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "mode" "TI")])
+
+(define_mode_iterator UMAXMIN_VI_32 [V4QI (V2HI "TARGET_SSE4_1")])
+
+(define_expand "3"
+  [(set (match_operand:UMAXMIN_VI_32 0 "register_operand")
+        (umaxmin:UMAXMIN_VI_32
+          (match_operand:UMAXMIN_VI_32 1 "register_operand")
+          (match_operand:UMAXMIN_VI_32 2 "register_operand")))]
+  "TARGET_SSE2"
+  "ix86_fixup_binary_operands_no_copy (, V4HImode, operands);")
+
+(define_insn "ssse3_abs2"
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yv")
+        (abs:MMXMODEI
+          (match_operand:MMXMODEI 1 "register_mmxmem_operand" "ym,Yv")))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
+  "@
+   pabs\t{%1, %0|%0, %1}
+   %vpabs\t{%1, %0|%0, %1}"
+  [(set_attr "mmx_isa" "native,*")
+   (set_attr "type" "sselog1")
+   (set_attr "prefix_rep" "0")
+   (set_attr "prefix_extra" "1")
+   (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
+   (set_attr "mode" "DI,TI")])
+
+(define_expand "abs2"
+  [(set (match_operand:MMXMODEI 0 "register_operand")
+        (abs:MMXMODEI
+          (match_operand:MMXMODEI 1 "register_operand")))]
+  "TARGET_MMX_WITH_SSE && TARGET_SSSE3")
+
+(define_insn "abs2"
+  [(set (match_operand:VI_32 0 "register_operand" "=Yv")
+        (abs:VI_32
+          (match_operand:VI_32 1 "register_operand" "Yv")))]
+  "TARGET_SSSE3"
+  "%vpabs\t{%1, %0|%0, %1}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_rep" "0")
+   (set_attr "prefix_extra" "1")
+   (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
+   (set_attr "mode" "TI")])
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;
+;; Parallel integral shifts
+;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
 (define_insn "mmx_ashr3"
   [(set (match_operand:MMXMODE24 0 "register_operand" "=y,x,")
         (ashiftrt:MMXMODE24
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 0f1108f0db1..7269147b87a 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17553,27 +17553,6 @@ (define_expand "abs2"
     }
 })
 
-(define_insn "ssse3_abs2"
-  [(set (match_operand:MMXMODEI 0 "register_operand" "=y,Yv")
-        (abs:MMXMODEI
-          (match_operand:MMXMODEI 1 "register_mmxmem_operand" "ym,Yv")))]
-  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
-  "@
-   pabs\t{%1, %0|%0, %1}
-   %vpabs\t{%1, %0|%0, %1}"
-  [(set_attr "mmx_isa" "native,*")
-   (set_attr "type" "sselog1")
-   (set_attr "prefix_rep" "0")
-   (set_attr "prefix_extra" "1")
-   (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)"))
-   (set_attr "mode" "DI,TI")])
-
-(define_expand "abs2"
-  [(set (match_operand:MMXMODEI 0 "register_operand")
-        (abs:MMXMODEI
-          (match_operand:MMXMODEI 1 "register_operand")))]
-  "TARGET_MMX_WITH_SSE && TARGET_SSSE3")
-
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; AMD SSE4A instructions
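
For illustration only, and not part of the patch: a minimal C sketch of the element-wise operations that the new 4-byte patterns describe (signed max on V4QI, unsigned min on V2HI, abs on V4QI). The type and function names are hypothetical. Whether GCC actually turns such loops into pmaxsb, pminuw or pabsb depends on the compiler version and options in use, which this sketch does not verify.

```c
#include <stdint.h>

/* Hypothetical 4-byte containers; the names are illustrative only and do
   not appear in the patch.  */
typedef struct { int8_t e[4]; } v4qi_t;     /* four signed bytes */
typedef struct { uint16_t e[2]; } v2hiu_t;  /* two unsigned halfwords */

/* Element-wise signed maximum of four bytes (the smax:V4QI operation).  */
static v4qi_t smax_v4qi(v4qi_t a, v4qi_t b)
{
    v4qi_t r;
    for (int i = 0; i < 4; i++)
        r.e[i] = a.e[i] > b.e[i] ? a.e[i] : b.e[i];
    return r;
}

/* Element-wise unsigned minimum of two halfwords (the umin:V2HI operation).  */
static v2hiu_t umin_v2hi(v2hiu_t a, v2hiu_t b)
{
    v2hiu_t r;
    for (int i = 0; i < 2; i++)
        r.e[i] = a.e[i] < b.e[i] ? a.e[i] : b.e[i];
    return r;
}

/* Element-wise absolute value of four bytes (the abs:V4QI operation).
   The result for -128 is implementation-defined in C, so callers in this
   sketch avoid that input.  */
static v4qi_t abs_v4qi(v4qi_t a)
{
    v4qi_t r;
    for (int i = 0; i < 4; i++)
        r.e[i] = (int8_t)(a.e[i] < 0 ? -a.e[i] : a.e[i]);
    return r;
}
```

Compiling such code with optimization and inspecting the assembly is one way to check which of the patterns above a given GCC build selects.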