From patchwork Thu Jul 8 08:25:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roger Sayle X-Patchwork-Id: 1502138 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=nextmovesoftware.com header.i=@nextmovesoftware.com header.a=rsa-sha256 header.s=default header.b=KX2qLMTe; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GL8W931Tkz9sWq for ; Thu, 8 Jul 2021 18:25:40 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D0313398240D for ; Thu, 8 Jul 2021 08:25:37 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from server.nextmovesoftware.com (server.nextmovesoftware.com [162.254.253.69]) by sourceware.org (Postfix) with ESMTPS id 925CB3838025 for ; Thu, 8 Jul 2021 08:25:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 925CB3838025 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=nextmovesoftware.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=nextmovesoftware.com DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=nextmovesoftware.com; s=default; h=Content-Type:MIME-Version:Message-ID: Date:Subject:Cc:To:From:Sender:Reply-To:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:In-Reply-To:References:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=GCpMrOnroOberxVJgQpKt1cW9/w9To3llXGpXUGQiRE=; b=KX2qLMTeSgdLYYTMFC0HPi+1VM 3POE2nyrWbDZdRNeGXBDZUXxBC9WUJ6OJCMqeO9T6KNJ9eQ8N+ZpGSSfp3kdgBm5/TBdVx3zTnM4K r0+hpCEzkKXdxFd06vrYvTBMbTnmjc8pkKaBeOUPfpTAaRFf/Xag8KAK5BxMbvPHb2nqLkNHzw6fR /3qYykD9k9PLdbuwMV6r+PqySGwoLwg9MVLQubFfeDwtZ/Csb3vdbEF8qJCC7xkFdZxK/ZhGsaE64 tn5TVuxYdm53PdKjaKg/ODmgI+ujhLOYC+Qu6vmkzsQO21RF5qdzS3dW1J6iRgbCBKIlXJeTfLc95 WV0DhGEA==; Received: from host86-169-60-32.range86-169.btcentralplus.com ([86.169.60.32]:57517 helo=Dell) by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1m1PLX-0005eq-29; Thu, 08 Jul 2021 04:25:23 -0400 From: "Roger Sayle" To: "'GCC Patches'" Subject: [x86_64 PATCH]: Improvement to signed division of integer constant. Date: Thu, 8 Jul 2021 09:25:21 +0100 Message-ID: <000a01d773d2$cbf50d80$63df2880$@nextmovesoftware.com> MIME-Version: 1.0 X-Mailer: Microsoft Outlook 16.0 Thread-Index: Addz0THncLW0LHiPTiKY3i1lzF4Y6w== Content-Language: en-gb X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com X-AntiAbuse: Original Domain - gcc.gnu.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - nextmovesoftware.com X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id: roger@nextmovesoftware.com X-Authenticated-Sender: server.nextmovesoftware.com: roger@nextmovesoftware.com X-Source: X-Source-Args: X-Source-Dir: X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" This patch tweaks the way GCC handles 32-bit integer division on x86_64, when the numerator is constant. Currently the function int foo (int x) { return 100/x; } generates the code: foo: movl $100, %eax cltd idivl %edi ret where the sign-extension instruction "cltd" creates a long dependency chain, as it depends on the "mov" before it, and is depended upon by "idivl" after it. With this patch, GCC now matches both icc and LLVM and uses an xor instead, generating: foo: xorl %edx, %edx movl $100, %eax idivl %edi ret Microbenchmarking confirms that this is faster on Intel processors (Kaby lake), and no worse on AMD processors (Zen2), which agrees with intuition, but oddly disagrees with the llvm-mca cycle count prediction on godbolt.org. The tricky bit is that this sign-extension instruction is only produced by late (postreload) splitting, and unfortunately none of the subsequent passes (e.g. cprop_hardreg) is able to propagate and simplify its constant argument. The solution here is to introduce a define_insn_and_split that allows the constant numerator operand to be captured (by combine) and then split into an optimal form after reload. The above microbenchmarking also shows that eliminating the sign extension of negative values (using movl $-1,%edx) is also a performance improvement, as performed by icc but not by LLVM. Both the xor and movl sign-extensions are larger than cltd, so this transformation is prevented for -Os. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-07-08 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*divmodsi4_const): Optimize SImode divmod of a constant numerator with new define_insn_and_split. gcc/testsuite/ChangeLog * gcc.target/i386/divmod-9.c: New test case. Roger --- Roger Sayle NextMove Software Cambridge, UK /* { dg-do compile } */ /* { dg-options "-O2" } */ int foo (int x) { return 100/x; } int bar(int x) { return -100/x; } /* { dg-final { scan-assembler-not "(cltd|cdq)" } } */ diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 700c158..908ae33 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -8657,6 +8657,33 @@ [(set_attr "type" "idiv") (set_attr "mode" "SI")]) +;; Avoid sign-extension (using cdq) for constant numerators. +(define_insn_and_split "*divmodsi4_const" + [(set (match_operand:SI 0 "register_operand" "=&a") + (div:SI (match_operand:SI 2 "const_int_operand" "n") + (match_operand:SI 3 "nonimmediate_operand" "rm"))) + (set (match_operand:SI 1 "register_operand" "=&d") + (mod:SI (match_dup 2) (match_dup 3))) + (clobber (reg:CC FLAGS_REG))] + "!optimize_function_for_size_p (cfun)" + "#" + "reload_completed" + [(parallel [(set (match_dup 0) + (div:SI (match_dup 0) (match_dup 3))) + (set (match_dup 1) + (mod:SI (match_dup 0) (match_dup 3))) + (use (match_dup 1)) + (clobber (reg:CC FLAGS_REG))])] +{ + emit_move_insn (operands[0], operands[2]); + if (INTVAL (operands[2]) < 0) + emit_move_insn (operands[1], constm1_rtx); + else + ix86_expand_clear (operands[1]); +} + [(set_attr "type" "multi") + (set_attr "mode" "SI")]) + (define_expand "divmodqi4" [(parallel [(set (match_operand:QI 0 "register_operand") (div:QI