From patchwork Wed Dec 11 03:21:18 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1207438
From: =?utf-8?b?546p6L+Y5pyJ?=
Date: Wed, 11 Dec 2019 11:21:18 +0800
Subject: [PATCH] Add abs pattern to handle {si, di} mode abs to avoid pmax/cmove conversion (PR92651)
To: ubizjak@gmail.com, gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, hjl.tools@gmail.com, crazylht@gmail.com

Hi:

Currently, the smax/smin patterns added by r274481 cause an 8% regression in 525.x264_r with -O2 -march=corei7.
The reason is that some IA backends (those with TARGET_SSE4_1) transform a simple abs (using rshift, xor and sub) into pmax/pmin when the smax/smin patterns exist, which generates unnecessary SSE instructions. This patch adds abs patterns that generate the simple abs sequence for integer modes to recover the regression.

Bootstrap is OK, and regression testing on the i386 backend is OK.

OK for trunk?

Changelog

gcc/
        PR target/92651
        * config/i386/i386.h (TARGET_USE_SIMPLE_ABS_PATTERN): New macro.
        * config/i386/x86-tune.def (X86_TUNE_USE_SIMPLE_ABS_PATTERN): New.
        * config/i386/i386.md (abs<mode>2): New define_expand.

gcc/testsuite
        * gcc.target/i386/pr92651.c: New testcase.

Regards,
Hongyu, Wang

From c0bf64efbcaa989349130b676880cc2ed49fca69 Mon Sep 17 00:00:00 2001
From: hongyuw1
Date: Thu, 28 Nov 2019 12:49:04 +0000
Subject: [PATCH] Add abs pattern to handle {si,di} mode abs to avoid pmax/cmove conversion.

gcc/
        PR target/92651
        * config/i386/i386.h (TARGET_USE_SIMPLE_ABS_PATTERN): New macro.
        * config/i386/x86-tune.def (X86_TUNE_USE_SIMPLE_ABS_PATTERN): New.
        * config/i386/i386.md (abs<mode>2): New define_expand.

gcc/testsuite
        * gcc.target/i386/pr92651.c: New testcase.
---
 gcc/config/i386/i386.h                  |  2 ++
 gcc/config/i386/i386.md                 | 39 +++++++++++++++++++++++++
 gcc/config/i386/x86-tune.def            |  7 +++++
 gcc/testsuite/gcc.target/i386/pr92651.c | 16 ++++++++++
 4 files changed, 64 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr92651.c

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 108456b14d4..ea27526e42e 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -596,6 +596,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
        ix86_tune_features[X86_TUNE_USE_XCHG_FOR_ATOMIC_STORE]
 #define TARGET_EMIT_VZEROUPPER \
        ix86_tune_features[X86_TUNE_EMIT_VZEROUPPER]
+#define TARGET_USE_SIMPLE_ABS_PATTERN \
+       ix86_tune_features[X86_TUNE_USE_SIMPLE_ABS_PATTERN]
 
 /* Feature tests against the various architecture variations.
*/
 enum ix86_arch_indices {
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 7ff5872ba43..8e5059aedec 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -9668,6 +9668,45 @@
   "#"
   [(set_attr "isa" "noavx,noavx,avx,avx")])
 
+;; Special expand pattern to handle integer mode abs
+
+(define_expand "abs<mode>2"
+  [(set (match_operand:SWI48x 0 "register_operand")
+    (abs:SWI48x
+      (match_operand:SWI48x 1 "register_operand")))]
+  "TARGET_USE_SIMPLE_ABS_PATTERN"
+  {
+    machine_mode mode = <MODE>mode;
+
+    /* Generate rtx abs using abs (x) = (((signed) x >> (W-1)) ^ x) -
+       ((signed) x >> (W-1)) */
+    rtx shift_amount = gen_int_shift_amount (mode,
+                                             GET_MODE_PRECISION (mode)
+                                             - 1);
+    shift_amount = convert_modes (E_QImode, GET_MODE (shift_amount),
+                                  shift_amount, 1);
+    rtx shift_dst = gen_reg_rtx (mode);
+    rtx shift_op = gen_rtx_SET (shift_dst,
+                                gen_rtx_fmt_ee (ASHIFTRT, mode,
+                                                operands[1], shift_amount));
+    rtx clobber = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode,
+                                                          FLAGS_REG));
+    emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, shift_op,
+                                                      clobber)));
+
+    rtx xor_op = gen_rtx_SET (operands[0],
+                              gen_rtx_fmt_ee (XOR, mode, shift_dst,
+                                              operands[1]));
+    emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, xor_op, clobber)));
+
+    rtx minus_op = gen_rtx_SET (operands[0],
+                                gen_rtx_fmt_ee (MINUS, mode,
+                                                operands[0], shift_dst));
+    emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, minus_op,
+                                                      clobber)));
+    DONE;
+  })
+
 (define_expand "<code><mode>2"
   [(set (match_operand:X87MODEF 0 "register_operand")
         (absneg:X87MODEF (match_operand:X87MODEF 1 "register_operand")))]
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 328535d38d7..86ff24122e6 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -317,6 +317,13 @@ DEF_TUNE (X86_TUNE_ONE_IF_CONV_INSN, "one_if_conv_insn",
 DEF_TUNE (X86_TUNE_USE_XCHG_FOR_ATOMIC_STORE, "use_xchg_for_atomic_store",
          m_CORE_ALL | m_BDVER | m_ZNVER | m_GENERIC)
 
+/*
X86_TUNE_USE_SIMPLE_ABS_PATTERN: This enables a new abs pattern by
+   generating instructions for abs (x) = (((signed) x >> (W-1)) ^ x) -
+   ((signed) x >> (W-1)) instead of cmove or SSE max/abs instructions.  */
+DEF_TUNE (X86_TUNE_USE_SIMPLE_ABS_PATTERN, "use_simple_abs_pattern",
+          m_CORE_ALL | m_SILVERMONT | m_KNL | m_KNM | m_GOLDMONT
+          | m_GOLDMONT_PLUS | m_TREMONT)
+
 /*****************************************************************************/
 /* 387 instruction selection tuning                                          */
 /*****************************************************************************/
diff --git a/gcc/testsuite/gcc.target/i386/pr92651.c b/gcc/testsuite/gcc.target/i386/pr92651.c
new file mode 100644
index 00000000000..3d0c3c7bf4e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr92651.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=corei7" } */
+
+#include <stdlib.h>
+
+int foo(unsigned char a, unsigned char b)
+{
+    int isum=abs(a - b);
+    return isum;
+}
+
+/* { dg-final { scan-assembler-not "cmov*" } } */
+/* { dg-final { scan-assembler "(cltd|cdq|shr)" } } */
+/* { dg-final { scan-assembler-times "xor" 1 } } */
+/* { dg-final { scan-assembler-times "sub" 2 } } */
+
-- 
2.19.1
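
For reference, the shift/xor/sub identity that the new expander emits can be checked in plain C. This is only a minimal sketch and not part of the patch: the function names are hypothetical, and it assumes that right-shifting a negative signed integer is an arithmetic shift (GCC documents this behaviour, while ISO C leaves it implementation-defined). The xor and subtract are done on unsigned values so that the most negative input does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: branchless abs for W = 32.
   Assumes >> on a negative int32_t is an arithmetic shift.  */
uint32_t
simple_abs32 (int32_t x)
{
  /* sign is 0 for x >= 0 and all ones for x < 0.  */
  uint32_t sign = (uint32_t) (x >> 31);
  /* (x ^ sign) - sign leaves x unchanged for x >= 0 and
     computes ~x + 1 (two's complement negation) for x < 0.  */
  return ((uint32_t) x ^ sign) - sign;
}

/* Same identity for W = 64 (also hypothetical).  */
uint64_t
simple_abs64 (int64_t x)
{
  uint64_t sign = (uint64_t) (x >> 63);
  return ((uint64_t) x ^ sign) - sign;
}

/* Spot-check the identity on a few values, including the most
   negative 32-bit input.  */
void
simple_abs_selfcheck (void)
{
  assert (simple_abs32 (0) == 0u);
  assert (simple_abs32 (7) == 7u);
  assert (simple_abs32 (-5) == 5u);
  assert (simple_abs32 (INT32_MIN) == 2147483648u);
  assert (simple_abs64 (-5) == 5u);
}
```

In terms of the testcase above, `abs(a - b)` on two `unsigned char` inputs would correspond to `simple_abs32 ((int32_t) a - b)` in this sketch.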