From patchwork Wed Jan 12 19:06:34 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1579293
From: Uros Bizjak
Date: Wed, 12 Jan 2022 20:06:34 +0100
Subject: [PATCH] i386: Add CC clobber and splits for 32-bit vector mode
 logic insns [PR100637, PR103861]
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list

Add CC clobber to 32-bit vector mode logic insns to allow variants
with general-purpose registers.  Also improve ix86_expand_sse_movcc
to emit insns with a CC clobber for narrow vector modes, which
re-enables conditional moves for 16-bit and 32-bit vector modes
with -msse2.

2022-01-12  Uroš Bizjak

gcc/ChangeLog:

	PR target/100637
	PR target/103861
	* config/i386/i386-expand.c (ix86_emit_vec_binop): New static function.
	(ix86_expand_sse_movcc): Use ix86_emit_vec_binop instead of gen_rtx_X
	when constructing vector logic RTXes.
	(expand_vec_perm_pshufb2): Ditto.
	* config/i386/mmx.md (negv2qi): Disparage GPR alternative a bit.
	(v2qi3): Ditto.
	(vcond): Re-enable for TARGET_SSE2.
	(vcondu): Ditto.
	(vcond_mask_): Ditto.
	(one_cmpl2): Remove expander.
	(one_cmpl2): Rename from one_cmplv2qi.
	Use VI_16_32 mode iterator.
	(one_cmpl2 splitters): Use VI_16_32 mode iterator.
	Use lowpart_subreg instead of gen_lowpart to create subreg.
	(*andnot3): Merge from "*andnot" and "*andnotv2qi3" insn patterns
	using VI_16_32 mode iterator.  Disparage GPR alternative a bit.
	Add CC clobber.
	(*andnot3 splitters): Use VI_16_32 mode iterator.
	Use lowpart_subreg instead of gen_lowpart to create subreg.
	(*3): Merge from "*" and "*v2qi3" insn patterns using VI_16_32
	mode iterator.  Disparage GPR alternative a bit.  Add CC clobber.
	(*3 splitters): Use VI_16_32 mode iterator.
	Use lowpart_subreg instead of gen_lowpart to create subreg.

gcc/testsuite/ChangeLog:

	PR target/100637
	PR target/103861
	* g++.target/i386/pr100637-1b.C (dg-options): Use -msse2
	instead of -msse4.
	* g++.target/i386/pr100637-1w.C (dg-options): Ditto.
	* g++.target/i386/pr103861-1.C (dg-options): Ditto.
	* gcc.target/i386/pr100637-4b.c (dg-options): Ditto.
	* gcc.target/i386/pr103861-4.c (dg-options): Ditto.
	* gcc.target/i386/pr100637-1b.c: Remove scan-assembler
	directives for logic instructions.
	* gcc.target/i386/pr100637-1w.c: Ditto.
	* gcc.target/i386/warn-vect-op-2.c: Update dg-warning for
	vector logic operation.
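
For readers less familiar with narrow vector modes, the following is a minimal sketch (not part of the patch) of why 32-bit vector logic operations can be done in a general-purpose register, and of the AND/ANDN/IOR blend that ix86_expand_sse_movcc falls back to. It assumes a compiler with GNU C vector extensions; all type and function names (v4qi, v4qi_select, v4qi_max and so on) are hypothetical and chosen only for illustration. signed char is used instead of plain char so that the comparison behaves the same on every target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative sketch only: every name below is hypothetical and is
   not taken from the patch or from the GCC testsuite.  */

/* A 32-bit vector of four signed bytes (GNU C vector extension).  */
typedef signed char v4qi __attribute__ ((__vector_size__ (4)));

/* Element-wise logic operations on the narrow vector.  */
static inline v4qi v4qi_and (v4qi a, v4qi b) { return a & b; }
static inline v4qi v4qi_andn (v4qi a, v4qi b) { return a & ~b; }
static inline v4qi v4qi_ior (v4qi a, v4qi b) { return a | b; }
static inline v4qi v4qi_xor (v4qi a, v4qi b) { return a ^ b; }
static inline v4qi v4qi_not (v4qi a) { return ~a; }

/* Reinterpret the four bytes of V as one 32-bit integer.  */
static inline uint32_t
v4qi_bits (v4qi v)
{
  uint32_t u;
  memcpy (&u, &v, sizeof u);
  return u;
}

/* Return nonzero if every vector logic operation produces the same
   bits as the corresponding 32-bit scalar operation.  Logic operations
   have no carries between elements, so the element size is irrelevant.  */
static inline int
v4qi_logic_matches_scalar (v4qi a, v4qi b)
{
  uint32_t x = v4qi_bits (a), y = v4qi_bits (b);

  return (v4qi_bits (v4qi_and (a, b)) == (x & y)
          && v4qi_bits (v4qi_andn (a, b)) == (x & ~y)
          && v4qi_bits (v4qi_ior (a, b)) == (x | y)
          && v4qi_bits (v4qi_xor (a, b)) == (x ^ y)
          && v4qi_bits (v4qi_not (a)) == (uint32_t) ~x);
}

/* DEST = MASK ? OP_TRUE : OP_FALSE built from AND, ANDN and IOR only.
   Each element of MASK must be either 0 or -1.  */
static inline v4qi
v4qi_select (v4qi mask, v4qi op_true, v4qi op_false)
{
  v4qi t2 = op_true & mask;     /* Elements taken from OP_TRUE.  */
  v4qi t3 = ~mask & op_false;   /* Elements taken from OP_FALSE.  */
  return t3 | t2;
}

/* Element-wise maximum; the comparison yields 0 or -1 per element.  */
static inline v4qi
v4qi_max (v4qi a, v4qi b)
{
  v4qi mask = (v4qi) (a > b);
  return v4qi_select (mask, a, b);
}
```

On x86 the scalar forms of these operations modify the flags register, which is presumably why the insn patterns need the CC clobber before a general-purpose register alternative can be offered.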

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 8b1266fb9f1..0318f126785 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3752,6 +3752,27 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
   return dest;
 }
 
+/* Emit x86 binary operand CODE in mode MODE for SSE vector
+   instructions that can be performed using GP registers.  */
+
+static void
+ix86_emit_vec_binop (enum rtx_code code, machine_mode mode,
+                     rtx dst, rtx src1, rtx src2)
+{
+  rtx tmp;
+
+  tmp = gen_rtx_SET (dst, gen_rtx_fmt_ee (code, mode, src1, src2));
+
+  if (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (SImode)
+      && GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+    {
+      rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+      tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));
+    }
+
+  emit_insn (tmp);
+}
+
 /* Expand DEST = CMP ? OP_TRUE : OP_FALSE into a sequence of logical
    operations.  This is used for both scalar and vector conditional moves.  */
 
@@ -3820,23 +3841,20 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
   else if (op_false == CONST0_RTX (mode))
     {
       op_true = force_reg (mode, op_true);
-      x = gen_rtx_AND (mode, cmp, op_true);
-      emit_insn (gen_rtx_SET (dest, x));
+      ix86_emit_vec_binop (AND, mode, dest, cmp, op_true);
       return;
     }
   else if (op_true == CONST0_RTX (mode))
     {
       op_false = force_reg (mode, op_false);
       x = gen_rtx_NOT (mode, cmp);
-      x = gen_rtx_AND (mode, x, op_false);
-      emit_insn (gen_rtx_SET (dest, x));
+      ix86_emit_vec_binop (AND, mode, dest, x, op_false);
       return;
     }
   else if (INTEGRAL_MODE_P (mode) && op_true == CONSTM1_RTX (mode))
     {
       op_false = force_reg (mode, op_false);
-      x = gen_rtx_IOR (mode, cmp, op_false);
-      emit_insn (gen_rtx_SET (dest, x));
+      ix86_emit_vec_binop (IOR, mode, dest, cmp, op_false);
       return;
     }
   else if (TARGET_XOP)
@@ -4010,15 +4028,12 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false)
       else
         t3 = dest;
 
-      x = gen_rtx_AND (mode, op_true, cmp);
-      emit_insn (gen_rtx_SET (t2, x));
+      ix86_emit_vec_binop (AND, mode, t2, op_true, cmp);
 
       x = gen_rtx_NOT (mode, cmp);
-      x = gen_rtx_AND (mode, x, op_false);
-      emit_insn (gen_rtx_SET (t3, x));
+      ix86_emit_vec_binop (AND, mode, t3, x, op_false);
 
-      x = gen_rtx_IOR (mode, t3, t2);
-      emit_insn (gen_rtx_SET (dest, x));
+      ix86_emit_vec_binop (IOR, mode, dest, t3, t2);
     }
 }
 
@@ -20733,7 +20748,7 @@ expand_vec_perm_pshufb2 (struct expand_vec_perm_d *d)
   op = d->target;
   if (d->vmode != mode)
     op = gen_reg_rtx (mode);
-  emit_insn (gen_rtx_SET (op, gen_rtx_IOR (mode, l, h)));
+  ix86_emit_vec_binop (IOR, mode, op, l, h);
   if (op != d->target)
     emit_move_insn (d->target, gen_lowpart (d->vmode, op));
 
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index fa67278e003..8a8142c8a09 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1634,7 +1634,7 @@
   "operands[2] = force_reg (mode, CONST0_RTX (mode));")
 
 (define_insn "negv2qi2"
-  [(set (match_operand:V2QI 0 "register_operand" "=Q,&Yw")
+  [(set (match_operand:V2QI 0 "register_operand" "=?Q,&Yw")
         (neg:V2QI
           (match_operand:V2QI 1 "register_operand" "0,Yw")))
    (clobber (reg:CC FLAGS_REG))]
@@ -1740,7 +1740,7 @@
    (set_attr "mode" "TI")])
 
 (define_insn "v2qi3"
-  [(set (match_operand:V2QI 0 "register_operand" "=Q,x,Yw")
+  [(set (match_operand:V2QI 0 "register_operand" "=?Q,x,Yw")
         (plusminus:V2QI
           (match_operand:V2QI 1 "register_operand" "0,0,Yw")
           (match_operand:V2QI 2 "register_operand" "Q,x,Yw")))
@@ -2587,7 +2587,7 @@
              (match_operand:VI_16_32 5 "register_operand")])
           (match_operand:VI_16_32 1)
           (match_operand:VI_16_32 2)))]
-  "TARGET_SSE4_1"
+  "TARGET_SSE2"
 {
   bool ok = ix86_expand_int_vcond (operands);
   gcc_assert (ok);
@@ -2619,7 +2619,7 @@
              (match_operand:VI_16_32 5 "register_operand")])
           (match_operand:VI_16_32 1)
           (match_operand:VI_16_32 2)))]
-  "TARGET_SSE4_1"
+  "TARGET_SSE2"
 {
   bool ok = ix86_expand_int_vcond (operands);
   gcc_assert (ok);
@@ -2645,7 +2645,7 @@
           (match_operand:VI_16_32 1 "register_operand")
           (match_operand:VI_16_32 2 "register_operand")
           (match_operand:VI_16_32 3 "register_operand")))]
-  "TARGET_SSE4_1"
+  "TARGET_SSE2"
 {
   ix86_expand_sse_movcc (operands[0], operands[3],
                          operands[1], operands[2]);
@@ -2752,18 +2752,10 @@
   "TARGET_MMX_WITH_SSE"
   "operands[2] = force_reg (mode, CONSTM1_RTX (mode));")
 
-(define_expand "one_cmpl2"
-  [(set (match_operand:VI_32 0 "register_operand")
-        (xor:VI_32
-          (match_operand:VI_32 1 "register_operand")
-          (match_dup 2)))]
-  "TARGET_SSE2"
-  "operands[2] = force_reg (mode, CONSTM1_RTX (mode));")
-
-(define_insn "one_cmplv2qi2"
-  [(set (match_operand:V2QI 0 "register_operand" "=r,&x,&v")
-        (not:V2QI
-          (match_operand:V2QI 1 "register_operand" "0,x,v")))]
+(define_insn "one_cmpl2"
+  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,&x,&v")
+        (not:VI_16_32
+          (match_operand:VI_16_32 1 "register_operand" "0,x,v")))]
   ""
   "#"
   [(set_attr "isa" "*,sse2,avx512vl")
@@ -2771,32 +2763,30 @@
    (set_attr "mode" "SI,TI,TI")])
 
 (define_split
-  [(set (match_operand:V2QI 0 "general_reg_operand")
-        (not:V2QI
-          (match_operand:V2QI 1 "general_reg_operand")))]
+  [(set (match_operand:VI_16_32 0 "general_reg_operand")
+        (not:VI_16_32
+          (match_operand:VI_16_32 1 "general_reg_operand")))]
   "reload_completed"
   [(set (match_dup 0)
         (not:SI (match_dup 1)))]
 {
-  operands[1] = gen_lowpart (SImode, operands[1]);
-  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = lowpart_subreg (SImode, operands[1], mode);
+  operands[0] = lowpart_subreg (SImode, operands[0], mode);
 })
 
 (define_split
-  [(set (match_operand:V2QI 0 "sse_reg_operand")
-        (not:V2QI
-          (match_operand:V2QI 1 "sse_reg_operand")))]
+  [(set (match_operand:VI_16_32 0 "sse_reg_operand")
+        (not:VI_16_32
+          (match_operand:VI_16_32 1 "sse_reg_operand")))]
   "TARGET_SSE2 && reload_completed"
-  [(set (match_dup 0)
-        (xor:V4QI
+  [(set (match_dup 0) (match_dup 2))
+   (set (match_dup 0)
+        (xor:V16QI
           (match_dup 0) (match_dup 1)))]
 {
-  emit_insn
-    (gen_rtx_SET (gen_rtx_REG (V16QImode, REGNO (operands[0])),
-                  CONSTM1_RTX (V16QImode)));
-
-  operands[1] = gen_lowpart (V4QImode, operands[1]);
-  operands[0] = gen_lowpart (V4QImode, operands[0]);
+  operands[2] = CONSTM1_RTX (V16QImode);
+  operands[1] = lowpart_subreg (V16QImode, operands[1], mode);
+  operands[0] = lowpart_subreg (V16QImode, operands[0], mode);
 })
 
 (define_insn "mmx_andnot3"
@@ -2816,24 +2806,11 @@
    (set_attr "mode" "DI,TI,TI,TI")])
 
 (define_insn "*andnot3"
-  [(set (match_operand:VI_32 0 "register_operand" "=x,x,v")
-        (and:VI_32
-          (not:VI_32 (match_operand:VI_32 1 "register_operand" "0,x,v"))
-          (match_operand:VI_32 2 "register_operand" "x,x,v")))]
-  "TARGET_SSE2"
-  "@
-   pandn\t{%2, %0|%0, %2}
-   vpandn\t{%2, %1, %0|%0, %1, %2}
-   vpandnd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,avx,avx512vl")
-   (set_attr "type" "sselog")
-   (set_attr "mode" "TI")])
-
-(define_insn "*andnotv2qi3"
-  [(set (match_operand:V2QI 0 "register_operand" "=&r,r,x,x,v")
-        (and:V2QI
-          (not:V2QI (match_operand:V2QI 1 "register_operand" "0,r,0,x,v"))
-          (match_operand:V2QI 2 "register_operand" "r,r,x,x,v")))
+  [(set (match_operand:VI_16_32 0 "register_operand" "=?&r,?r,x,x,v")
+        (and:VI_16_32
+          (not:VI_16_32
+            (match_operand:VI_16_32 1 "register_operand" "0,r,0,x,v"))
+          (match_operand:VI_16_32 2 "register_operand" "r,r,x,x,v")))
    (clobber (reg:CC FLAGS_REG))]
   ""
   "#"
@@ -2842,10 +2819,10 @@
    (set_attr "mode" "SI,SI,TI,TI,TI")])
 
 (define_split
-  [(set (match_operand:V2QI 0 "general_reg_operand")
-        (and:V2QI
-          (not:V2QI (match_operand:V2QI 1 "general_reg_operand"))
-          (match_operand:V2QI 2 "general_reg_operand")))
+  [(set (match_operand:VI_16_32 0 "general_reg_operand")
+        (and:VI_16_32
+          (not:VI_16_32 (match_operand:VI_16_32 1 "general_reg_operand"))
+          (match_operand:VI_16_32 2 "general_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_BMI && reload_completed"
   [(parallel
@@ -2853,16 +2830,16 @@
            (and:SI (not:SI (match_dup 1)) (match_dup 2)))
       (clobber (reg:CC FLAGS_REG))])]
 {
-  operands[2] = gen_lowpart (SImode, operands[2]);
-  operands[1] = gen_lowpart (SImode, operands[1]);
-  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[2] = lowpart_subreg (SImode, operands[2], mode);
+  operands[1] = lowpart_subreg (SImode, operands[1], mode);
+  operands[0] = lowpart_subreg (SImode, operands[0], mode);
 })
 
 (define_split
-  [(set (match_operand:V2QI 0 "general_reg_operand")
-        (and:V2QI
-          (not:V2QI (match_operand:V2QI 1 "general_reg_operand"))
-          (match_operand:V2QI 2 "general_reg_operand")))
+  [(set (match_operand:VI_16_32 0 "general_reg_operand")
+        (and:VI_16_32
+          (not:VI_16_32 (match_operand:VI_16_32 1 "general_reg_operand"))
+          (match_operand:VI_16_32 2 "general_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "!TARGET_BMI && reload_completed"
   [(set (match_dup 0)
@@ -2872,24 +2849,24 @@
            (and:SI (match_dup 0) (match_dup 2)))
       (clobber (reg:CC FLAGS_REG))])]
 {
-  operands[2] = gen_lowpart (SImode, operands[2]);
-  operands[1] = gen_lowpart (SImode, operands[1]);
-  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[2] = lowpart_subreg (SImode, operands[2], mode);
+  operands[1] = lowpart_subreg (SImode, operands[1], mode);
+  operands[0] = lowpart_subreg (SImode, operands[0], mode);
 })
 
 (define_split
-  [(set (match_operand:V2QI 0 "sse_reg_operand")
-        (and:V2QI
-          (not:V2QI (match_operand:V2QI 1 "sse_reg_operand"))
-          (match_operand:V2QI 2 "sse_reg_operand")))
+  [(set (match_operand:VI_16_32 0 "sse_reg_operand")
+        (and:VI_16_32
+          (not:VI_16_32 (match_operand:VI_16_32 1 "sse_reg_operand"))
+          (match_operand:VI_16_32 2 "sse_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_SSE2 && reload_completed"
   [(set (match_dup 0)
-        (and:V4QI (not:V4QI (match_dup 1)) (match_dup 2)))]
+        (and:V16QI (not:V16QI (match_dup 1)) (match_dup 2)))]
 {
-  operands[2] = gen_lowpart (V4QImode, operands[2]);
-  operands[1] = gen_lowpart (V4QImode, operands[1]);
-  operands[0] = gen_lowpart (V4QImode, operands[0]);
+  operands[2] = lowpart_subreg (V16QImode, operands[2], mode);
+  operands[1] = lowpart_subreg (V16QImode, operands[1], mode);
+  operands[0] = lowpart_subreg (V16QImode, operands[0], mode);
 })
 
 (define_expand "mmx_3"
@@ -2925,24 +2902,10 @@
    (set_attr "mode" "DI,TI,TI,TI")])
 
 (define_insn "3"
-  [(set (match_operand:VI_32 0 "register_operand" "=x,x,v")
-        (any_logic:VI_32
-          (match_operand:VI_32 1 "register_operand" "%0,x,v")
-          (match_operand:VI_32 2 "register_operand" "x,x,v")))]
-  "TARGET_SSE2"
-  "@
-   p\t{%2, %0|%0, %2}
-   vp\t{%2, %1, %0|%0, %1, %2}
-   vpd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,avx,avx512vl")
-   (set_attr "type" "sselog")
-   (set_attr "mode" "TI")])
-
-(define_insn "v2qi3"
-  [(set (match_operand:V2QI 0 "register_operand" "=r,x,x,v")
-        (any_logic:V2QI
-          (match_operand:V2QI 1 "register_operand" "%0,0,x,v")
-          (match_operand:V2QI 2 "register_operand" "r,x,x,v")))
+  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,x,x,v")
+        (any_logic:VI_16_32
+          (match_operand:VI_16_32 1 "register_operand" "%0,0,x,v")
+          (match_operand:VI_16_32 2 "register_operand" "r,x,x,v")))
    (clobber (reg:CC FLAGS_REG))]
   ""
   "#"
@@ -2951,10 +2914,10 @@
    (set_attr "mode" "SI,TI,TI,TI")])
 
 (define_split
-  [(set (match_operand:V2QI 0 "general_reg_operand")
-        (any_logic:V2QI
-          (match_operand:V2QI 1 "general_reg_operand")
-          (match_operand:V2QI 2 "general_reg_operand")))
+  [(set (match_operand:VI_16_32 0 "general_reg_operand")
+        (any_logic:VI_16_32
+          (match_operand:VI_16_32 1 "general_reg_operand")
+          (match_operand:VI_16_32 2 "general_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "reload_completed"
   [(parallel
@@ -2962,24 +2925,24 @@
            (any_logic:SI (match_dup 1) (match_dup 2)))
       (clobber (reg:CC FLAGS_REG))])]
 {
-  operands[2] = gen_lowpart (SImode, operands[2]);
-  operands[1] = gen_lowpart (SImode, operands[1]);
-  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[2] = lowpart_subreg (SImode, operands[2], mode);
+  operands[1] = lowpart_subreg (SImode, operands[1], mode);
+  operands[0] = lowpart_subreg (SImode, operands[0], mode);
 })
 
 (define_split
-  [(set (match_operand:V2QI 0 "sse_reg_operand")
-        (any_logic:V2QI
-          (match_operand:V2QI 1 "sse_reg_operand")
-          (match_operand:V2QI 2 "sse_reg_operand")))
+  [(set (match_operand:VI_16_32 0 "sse_reg_operand")
+        (any_logic:VI_16_32
+          (match_operand:VI_16_32 1 "sse_reg_operand")
+          (match_operand:VI_16_32 2 "sse_reg_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_SSE2 && reload_completed"
   [(set (match_dup 0)
-        (any_logic:V4QI (match_dup 1) (match_dup 2)))]
+        (any_logic:V16QI (match_dup 1) (match_dup 2)))]
 {
-  operands[2] = gen_lowpart (V4QImode, operands[2]);
-  operands[1] = gen_lowpart (V4QImode, operands[1]);
-  operands[0] = gen_lowpart (V4QImode, operands[0]);
+  operands[2] = lowpart_subreg (V16QImode, operands[2], mode);
+  operands[1] = lowpart_subreg (V16QImode, operands[1], mode);
+  operands[0] = lowpart_subreg (V16QImode, operands[0], mode);
 })
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
diff --git a/gcc/testsuite/g++.target/i386/pr100637-1b.C b/gcc/testsuite/g++.target/i386/pr100637-1b.C
index d602ac08b4d..35b5df7c9dd 100644
--- a/gcc/testsuite/g++.target/i386/pr100637-1b.C
+++ b/gcc/testsuite/g++.target/i386/pr100637-1b.C
@@ -1,6 +1,6 @@
 /* PR target/100637 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -msse4" } */
+/* { dg-options "-O2 -msse2" } */
 
 typedef unsigned char __attribute__((__vector_size__ (4))) __v4qu;
 typedef char __attribute__((__vector_size__ (4))) __v4qi;
diff --git a/gcc/testsuite/g++.target/i386/pr100637-1w.C b/gcc/testsuite/g++.target/i386/pr100637-1w.C
index c6056454897..a3ed06fddee 100644
--- a/gcc/testsuite/g++.target/i386/pr100637-1w.C
+++ b/gcc/testsuite/g++.target/i386/pr100637-1w.C
@@ -1,6 +1,6 @@
 /* PR target/100637 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -msse4" } */
+/* { dg-options "-O2 -msse2" } */
 
 typedef unsigned short __attribute__((__vector_size__ (4))) __v2hu;
 typedef short __attribute__((__vector_size__ (4))) __v2hi;
diff --git a/gcc/testsuite/g++.target/i386/pr103861-1.C b/gcc/testsuite/g++.target/i386/pr103861-1.C
index 940c939e04f..6475728991e 100644
--- a/gcc/testsuite/g++.target/i386/pr103861-1.C
+++ b/gcc/testsuite/g++.target/i386/pr103861-1.C
@@ -1,6 +1,6 @@
 /* PR target/103861 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -msse4" } */
+/* { dg-options "-O2 -msse2" } */
 
 typedef unsigned char __attribute__((__vector_size__ (2))) __v2qu;
 typedef char __attribute__((__vector_size__ (2))) __v2qi;
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-1b.c b/gcc/testsuite/gcc.target/i386/pr100637-1b.c
index 3e7445ad9bb..f5b1c122a65 100644
--- a/gcc/testsuite/gcc.target/i386/pr100637-1b.c
+++ b/gcc/testsuite/gcc.target/i386/pr100637-1b.c
@@ -5,17 +5,14 @@
 typedef char __v4qi __attribute__ ((__vector_size__ (4)));
 
 __v4qi and (__v4qi a, __v4qi b) { return a & b; };
-/* { dg-final { scan-assembler "andv4qi3" } } */
 
 __v4qi andn (__v4qi a, __v4qi b) { return a & ~b; };
-/* { dg-final { scan-assembler "andnotv4qi3" } } */
 
 __v4qi or (__v4qi a, __v4qi b) { return a | b; };
-/* { dg-final { scan-assembler "iorv4qi3" } } */
 
 __v4qi xor (__v4qi a, __v4qi b) { return a ^ b; };
+
 __v4qi not (__v4qi a) { return ~a; };
-/* { dg-final { scan-assembler-times "xorv4qi3" 2 } } */
 
 __v4qi plus (__v4qi a, __v4qi b) { return a + b; };
 /* { dg-final { scan-assembler "addv4qi3" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-1w.c b/gcc/testsuite/gcc.target/i386/pr100637-1w.c
index fe6964044b6..5f2798878af 100644
--- a/gcc/testsuite/gcc.target/i386/pr100637-1w.c
+++ b/gcc/testsuite/gcc.target/i386/pr100637-1w.c
@@ -6,17 +6,14 @@ typedef short __v2hi __attribute__ ((__vector_size__ (4)));
 typedef unsigned short __v2hu __attribute__ ((__vector_size__ (4)));
 
 __v2hi and (__v2hi a, __v2hi b) { return a & b; };
-/* { dg-final { scan-assembler "andv2hi3" } } */
 
 __v2hi andn (__v2hi a, __v2hi b) { return a & ~b; };
-/* { dg-final { scan-assembler "andnotv2hi3" } } */
 
 __v2hi or (__v2hi a, __v2hi b) { return a | b; };
-/* { dg-final { scan-assembler "iorv2hi3" } } */
 
 __v2hi xor (__v2hi a, __v2hi b) { return a ^ b; };
+
 __v2hi not (__v2hi a) { return ~a; };
-/* { dg-final { scan-assembler-times "xorv2hi3" 2 } } */
 
 __v2hi plus (__v2hi a, __v2hi b) { return a + b; };
 /* { dg-final { scan-assembler "addv2hi3" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr100637-4b.c b/gcc/testsuite/gcc.target/i386/pr100637-4b.c
index add4506e4c1..198e3dd3352 100644
--- a/gcc/testsuite/gcc.target/i386/pr100637-4b.c
+++ b/gcc/testsuite/gcc.target/i386/pr100637-4b.c
@@ -1,6 +1,6 @@
 /* PR target/100637 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
 
 typedef char T;
 
diff --git a/gcc/testsuite/gcc.target/i386/pr103861-4.c b/gcc/testsuite/gcc.target/i386/pr103861-4.c
index 54c1859b027..54333697316 100644
--- a/gcc/testsuite/gcc.target/i386/pr103861-4.c
+++ b/gcc/testsuite/gcc.target/i386/pr103861-4.c
@@ -1,6 +1,6 @@
 /* PR target/100637 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -msse4" } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
 
 typedef char T;
 
diff --git a/gcc/testsuite/gcc.target/i386/warn-vect-op-2.c b/gcc/testsuite/gcc.target/i386/warn-vect-op-2.c
index 4560f7070bb..5e378b6bd04 100644
--- a/gcc/testsuite/gcc.target/i386/warn-vect-op-2.c
+++ b/gcc/testsuite/gcc.target/i386/warn-vect-op-2.c
@@ -14,7 +14,7 @@ int main (int argc, char *argv[])
     v0 + v1,         /* { dg-warning "expanded piecewise" } */
     v0 - v1,         /* { dg-warning "expanded piecewise" } */
     v0 > v1,         /* { dg-warning "expanded piecewise" } */
-    v0 & v1,         /* { dg-warning "expanded piecewise" } */
+    v0 & v1,         /* { dg-warning "expanded in parallel" } */
     __builtin_shuffle (v0, v1),        /* { dg-warning "expanded piecewise" } */
     __builtin_shuffle (v0, v1, v1)     /* { dg-warning "expanded piecewise" } */
   };