From patchwork Tue May 31 15:00:14 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yuri Rumyantsev
X-Patchwork-Id: 628276
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Tue, 31 May 2016 18:00:14 +0300
Subject: [PATCH][RFC] Initial patch for better performance of 64-bit math instructions in 32-bit mode on x86-64
From: Yuri Rumyantsev
To: Uros Bizjak, "H.J. Lu", gcc-patches

Hi Uros,

Here is an initial patch to improve the performance of 64-bit integer
arithmetic in 32-bit mode. We discovered that gcc is significantly behind
icc and clang on the rsa benchmark from the eembc2.0 suite.

The problem function looks like this:

typedef unsigned long long ull;
typedef unsigned long ul;
ul mul_add(ul *rp, ul *ap, int num, ul w)
{
  ul c1=0;
  ull t;
  for (;;)
    {
      { t=(ull)w * ap[0] + rp[0] + c1;
        rp[0]= ((ul)t)&0xffffffffL;
        c1= ((ul)((t)>>32))&(0xffffffffL); };
      if (--num == 0) break;
      { t=(ull)w * ap[1] + rp[1] + c1;
        rp[1]= ((ul)(t))&(0xffffffffL);
        c1= (((ul)((t)>>32))&(0xffffffffL)); };
      if (--num == 0) break;
      { t=(ull)w * ap[2] + rp[2] + c1;
        rp[2]= (((ul)(t))&(0xffffffffL));
        c1= (((ul)((t)>>32))&(0xffffffffL)); };
      if (--num == 0) break;
      { t=(ull)w * ap[3] + rp[3] + c1;
        rp[3]= (((ul)(t))&(0xffffffffL));
        c1= (((ul)((t)>>32))&(0xffffffffL)); };
      if (--num == 0) break;
      ap+=4;
      rp+=4;
    }
  return(c1);
}

If we apply the patch below, we get a +6% speed-up for rsa on Silvermont.

The patch looks like this (it is not complete, since there are other
64-bit instructions, e.g. subtraction):

What is your opinion?

Index: i386.md
===================================================================
--- i386.md	(revision 236181)
+++ i386.md	(working copy)
@@ -5439,7 +5439,7 @@
    (clobber (reg:CC FLAGS_REG))]
   "ix86_binary_operator_ok (PLUS, mode, operands)"
   "#"
-  "reload_completed"
+  "1"
   [(parallel [(set (reg:CCC FLAGS_REG)
 		   (compare:CCC
 		     (plus:DWIH (match_dup 1) (match_dup 2))
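
For readers who want to experiment with the hot loop outside the benchmark, here is a minimal sketch of the arithmetic that mul_add performs, written with fixed-width types. The names mul_add_step and mul_add_ref are hypothetical and are not part of the benchmark or of the patch. The sketch assumes num >= 1, as the original loop does, and only illustrates the 64-bit additions and the 32x32->64 multiply that a 32-bit target has to split into 32-bit operations. It is not the benchmark source.

```c
#include <stdint.h>

/* Hypothetical helper: one multiply-accumulate step of the loop.
   t = w * a + *rp + carry always fits in 64 bits, because
   (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1.  */
static uint32_t mul_add_step(uint32_t *rp, uint32_t a, uint32_t w,
                             uint32_t carry)
{
    uint64_t t = (uint64_t)w * a + *rp + carry;
    *rp = (uint32_t)t;            /* low 32 bits go back to memory */
    return (uint32_t)(t >> 32);   /* high 32 bits become the next carry */
}

/* Hypothetical reference version of mul_add without manual unrolling.
   Assumes num >= 1, matching the original loop's precondition.  */
uint32_t mul_add_ref(uint32_t *rp, const uint32_t *ap, int num, uint32_t w)
{
    uint32_t carry = 0;
    for (int i = 0; i < num; i++)
        carry = mul_add_step(&rp[i], ap[i], w, carry);
    return carry;
}
```

On a 32-bit target each step needs one widening multiply plus two double-word additions with carry propagation, which is why the point at which the double-word add pattern is split matters for this loop.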