From patchwork Wed Jan 23 19:22:19 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1030120
From: Uros Bizjak
Date: Wed, 23 Jan 2019 20:22:19 +0100
Subject: [PATCH, i386] Fix PR 88998, bad codegen with mmx instructions
To: "gcc-patches@gcc.gnu.org"

The attached patch adds SSE alternatives to the sse2_cvtpi2pd,
sse2_cvtpd2pi and sse2_cvttpd2pi patterns, so that MMX registers are
avoided when, e.g., the _mm_cvtepi32_pd intrinsic is used.

Without the patch, the testcase compiles to (-O2 -mavx):

_Z7prepareii:
        vmovd   %edi, %xmm1
        vpinsrd $1, %esi, %xmm1, %xmm0
        movdq2q %xmm0, %mm0
        cvtpi2pd        %mm0, %xmm0
        vhaddpd %xmm0, %xmm0, %xmm0
        ret

while patched gcc generates:

        vmovd   %edi, %xmm1
        vpinsrd $1, %esi, %xmm1, %xmm0
        vcvtdq2pd       %xmm0, %xmm0
        vhaddpd %xmm0, %xmm0, %xmm0
        ret

The latter avoids the transition of the FPU to MMX mode.

2019-01-23  Uroš Bizjak

        PR target/88998
        * config/i386/sse.md (sse2_cvtpi2pd): Add SSE alternatives.
        Disparage MMX alternative.
        (sse2_cvtpd2pi): Ditto.
        (sse2_cvttpd2pi): Ditto.

testsuite/ChangeLog:

2019-01-23  Uroš Bizjak

        PR target/88998
        * g++.target/i386/pr88998.C: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline, will be backported to release branches.

Uros.

Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md  (revision 268188)
+++ config/i386/sse.md  (working copy)
@@ -4997,37 +4997,49 @@
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
 (define_insn "sse2_cvtpi2pd"
-  [(set (match_operand:V2DF 0 "register_operand" "=x,x")
-        (float:V2DF (match_operand:V2SI 1 "nonimmediate_operand" "y,m")))]
+  [(set (match_operand:V2DF 0 "register_operand" "=v,x")
+        (float:V2DF (match_operand:V2SI 1 "nonimmediate_operand" "vBm,?!y")))]
   "TARGET_SSE2"
-  "cvtpi2pd\t{%1, %0|%0, %1}"
+  "@
+   %vcvtdq2pd\t{%1, %0|%0, %1}
+   cvtpi2pd\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
-   (set_attr "unit" "mmx,*")
-   (set_attr "prefix_data16" "1,*")
+   (set_attr "unit" "*,mmx")
+   (set_attr "prefix_data16" "*,1")
+   (set_attr "prefix" "maybe_vex,*")
    (set_attr "mode" "V2DF")])
 
 (define_insn "sse2_cvtpd2pi"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
-        (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "xm")]
+  [(set (match_operand:V2SI 0 "register_operand" "=v,?!y")
+        (unspec:V2SI [(match_operand:V2DF 1 "nonimmediate_operand" "vBm,xm")]
                      UNSPEC_FIX_NOTRUNC))]
   "TARGET_SSE2"
-  "cvtpd2pi\t{%1, %0|%0, %1}"
+  "@
+   * return TARGET_AVX ? \"vcvtpd2dq{x}\t{%1, %0|%0, %1}\" : \"cvtpd2dq\t{%1, %0|%0, %1}\";
+   cvtpd2pi\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
-   (set_attr "unit" "mmx")
+   (set_attr "unit" "*,mmx")
+   (set_attr "amdfam10_decode" "double")
+   (set_attr "athlon_decode" "vector")
    (set_attr "bdver1_decode" "double")
-   (set_attr "btver2_decode" "direct")
-   (set_attr "prefix_data16" "1")
-   (set_attr "mode" "DI")])
+   (set_attr "prefix_data16" "*,1")
+   (set_attr "prefix" "maybe_vex,*")
+   (set_attr "mode" "TI")])
 
 (define_insn "sse2_cvttpd2pi"
-  [(set (match_operand:V2SI 0 "register_operand" "=y")
-        (fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "xm")))]
+  [(set (match_operand:V2SI 0 "register_operand" "=v,?!y")
+        (fix:V2SI (match_operand:V2DF 1 "nonimmediate_operand" "vBm,xm")))]
   "TARGET_SSE2"
-  "cvttpd2pi\t{%1, %0|%0, %1}"
+  "@
+   * return TARGET_AVX ? \"vcvttpd2dq{x}\t{%1, %0|%0, %1}\" : \"cvttpd2dq\t{%1, %0|%0, %1}\";
+   cvttpd2pi\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
-   (set_attr "unit" "mmx")
+   (set_attr "unit" "*,mmx")
+   (set_attr "amdfam10_decode" "double")
+   (set_attr "athlon_decode" "vector")
    (set_attr "bdver1_decode" "double")
-   (set_attr "prefix_data16" "1")
+   (set_attr "prefix_data16" "*,1")
+   (set_attr "prefix" "maybe_vex,*")
    (set_attr "mode" "TI")])
 
 (define_insn "sse2_cvtsi2sd"
Index: testsuite/g++.target/i386/pr88998.C
===================================================================
--- testsuite/g++.target/i386/pr88998.C (nonexistent)
+++ testsuite/g++.target/i386/pr88998.C (working copy)
@@ -0,0 +1,31 @@
+// PR target/88998
+// { dg-do run { target sse2_runtime } }
+// { dg-options "-O2 -msse2 -mfpmath=387" }
+// { dg-require-effective-target c++11 }
+
+#include <cassert>
+#include <unordered_map>
+#include <x86intrin.h>
+
+double
+__attribute__((noinline))
+prepare (int a, int b)
+{
+  __m128i is = _mm_setr_epi32 (a, b, 0, 0);
+  __m128d ds = _mm_cvtepi32_pd (is);
+  return ds[0] + ds[1];
+}
+
+int
+main (int, char **)
+{
+  double d = prepare (1, 2);
+
+  std::unordered_map < int, int >m;
+  m.insert ({0, 0});
+  m.insert ({1, 1});
+  assert (m.load_factor () <= m.max_load_factor ());
+
+  assert (d == 3);
+  return 0;
+}
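
For readers comparing the two double-to-integer patterns touched by the patch: sse2_cvtpd2pi (UNSPEC_FIX_NOTRUNC) converts using the current rounding mode, while sse2_cvttpd2pi (fix) truncates toward zero. The following is only a portable scalar sketch of that per-lane difference, not code from GCC. The helper names convert_round and convert_trunc are hypothetical, and the model assumes the default round-to-nearest-even mode and inputs that fit in a 32-bit integer.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scalar model of one lane of cvtpd2pi / cvtpd2dq:
// rounds according to the current rounding mode (nearest-even by default).
// Assumes x is within the range of int32_t.
inline std::int32_t convert_round(double x) {
  return static_cast<std::int32_t>(std::nearbyint(x));
}

// Hypothetical scalar model of one lane of cvttpd2pi / cvttpd2dq:
// always truncates toward zero. Assumes x is within the range of int32_t.
inline std::int32_t convert_trunc(double x) {
  return static_cast<std::int32_t>(std::trunc(x));
}

// Small self-check showing where the two conversions differ.
inline void check_examples() {
  assert(convert_round(2.7) == 3);
  assert(convert_trunc(2.7) == 2);
  assert(convert_round(-2.7) == -3);
  assert(convert_trunc(-2.7) == -2);
}
```

Out-of-range and NaN inputs are deliberately left out of this sketch, since the cast is undefined in C++ for them while the hardware instructions have their own defined result.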