From patchwork Thu Jul 20 18:57:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1810603
Date: Thu, 20 Jul 2023 20:57:53 +0200
Subject: [committed] i386: Double-word sign-extension missed-optimization [PR110717]
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Uros Bizjak via Gcc-patches
From: Uros Bizjak
Reply-To: Uros Bizjak
Sender: "Gcc-patches"

When sign-extending the value in a double-word register pair using an
ashift and ashiftrt sequence with the same immediate count, where the
count is less than the word width, there is no need to shift the lower
word of the value. The sign-extension could be limited to the upper
word, but we uselessly shift the lower word with it as well:

    movq    %rdi, %rax
    movq    %rsi, %rdx
    shldq   $59, %rdi, %rdx
    salq    $59, %rax
    shrdq   $59, %rdx, %rax
    sarq    $59, %rdx
    ret

for -m64 and

    movl    4(%esp), %eax
    movl    8(%esp), %edx
    shldl   $27, %eax, %edx
    sall    $27, %eax
    shrdl   $27, %edx, %eax
    sarl    $27, %edx
    ret

for -m32.

The patch introduces a new post-reload splitter to provide the combined
ASHIFTRT/ASHIFT instruction pattern. The instruction is split into a
sequence of SAL and SAR insns with the same immediate count operand:

    movq    %rsi, %rdx
    movq    %rdi, %rax
    salq    $59, %rdx
    sarq    $59, %rdx
    ret

Some extra care is required to handle the STV transform properly: for
32-bit AVX512VL targets we emit a sequence of DImode PSLLQ and PSRAQ
insns when profitable.

The patch also fixes a small oversight and enables the STV transform of
SImode ASHIFTRT to PSRAD for SSE2 targets as well.

    PR target/110717

gcc/ChangeLog:

    * config/i386/i386-features.cc
    (general_scalar_chain::compute_convert_gain): Calculate gain
    for extend highpart case.
    (general_scalar_chain::convert_op): Handle
    ASHIFTRT/ASHIFT combined RTX.
    (general_scalar_to_vector_candidate_p): Enable ASHIFTRT for
    SImode for SSE2 targets.  Handle ASHIFTRT/ASHIFT combined RTX.
    * config/i386/i386.md (*extend<dwi>2_doubleword_highpart):
    New define_insn_and_split pattern.
    (*extendv2di2_highpart_stv): Ditto.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/pr110717.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4d69251d4f5..f801a8fc94a 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -572,6 +572,9 @@ general_scalar_chain::compute_convert_gain ()
             {
               if (INTVAL (XEXP (src, 1)) >= 32)
                 igain += ix86_cost->add;
+              /* Gain for extend highpart case.  */
+              else if (GET_CODE (XEXP (src, 0)) == ASHIFT)
+                igain += ix86_cost->shift_const - ix86_cost->sse_op;
               else
                 igain += ix86_cost->shift_const;
             }
@@ -951,7 +954,8 @@ general_scalar_chain::convert_op (rtx *op, rtx_insn *insn)
 {
   *op = copy_rtx_if_shared (*op);
 
-  if (GET_CODE (*op) == NOT)
+  if (GET_CODE (*op) == NOT
+      || GET_CODE (*op) == ASHIFT)
     {
       convert_op (&XEXP (*op, 0), insn);
       PUT_MODE (*op, vmode);
@@ -2120,7 +2124,7 @@ general_scalar_to_vector_candidate_p (rtx_insn *insn, enum machine_mode mode)
   switch (GET_CODE (src))
     {
     case ASHIFTRT:
-      if (!TARGET_AVX512VL)
+      if (mode == DImode && !TARGET_AVX512VL)
         return false;
       /* FALLTHRU */
 
@@ -2131,6 +2135,14 @@ general_scalar_to_vector_candidate_p (rtx_insn *insn, enum machine_mode mode)
       if (!CONST_INT_P (XEXP (src, 1))
           || !IN_RANGE (INTVAL (XEXP (src, 1)), 0, GET_MODE_BITSIZE (mode)-1))
         return false;
+
+      /* Check for extend highpart case.  */
+      if (mode != DImode
+          || GET_CODE (src) != ASHIFTRT
+          || GET_CODE (XEXP (src, 0)) != ASHIFT)
+        break;
+
+      src = XEXP (src, 0);
       break;
 
     case SMAX:
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8c54aa5e981..4db210cc795 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15292,6 +15292,41 @@ (define_insn "*qi_ext_2"
                 (const_string "0")
                 (const_string "*")))
    (set_attr "mode" "QI")])
+
+(define_insn_and_split "*extend<dwi>2_doubleword_highpart"
+  [(set (match_operand:<DWI> 0 "register_operand" "=r")
+        (ashiftrt:<DWI>
+          (ashift:<DWI> (match_operand:<DWI> 1 "nonimmediate_operand" "0")
+                        (match_operand:QI 2 "const_int_operand"))
+          (match_operand:QI 3 "const_int_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "INTVAL (operands[2]) == INTVAL (operands[3])
+   && UINTVAL (operands[2]) < <MODE_SIZE> * BITS_PER_UNIT"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 4)
+                   (ashift:DWIH (match_dup 4) (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 4)
+                   (ashiftrt:DWIH (match_dup 4) (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (<DWI>mode, &operands[0], 1, &operands[0], &operands[4]);")
+
+(define_insn_and_split "*extendv2di2_highpart_stv"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (ashiftrt:V2DI
+          (ashift:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "vm")
+                       (match_operand:QI 2 "const_int_operand"))
+          (match_operand:QI 3 "const_int_operand")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_AVX512VL
+   && INTVAL (operands[2]) == INTVAL (operands[3])
+   && UINTVAL (operands[2]) < 32"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (ashift:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (ashiftrt:V2DI (match_dup 0) (match_dup 2)))])
 
 ;; Rotate instructions
 
diff --git a/gcc/testsuite/gcc.target/i386/pr110717.c b/gcc/testsuite/gcc.target/i386/pr110717.c
new file mode 100644
index 00000000000..233f0eae5b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110717.c
@@ -0,0 +1,21 @@
+/* PR target/110717 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+unsigned __int128
+foo (unsigned __int128 x)
+{
+  x <<= 59;
+  return ((__int128) x) >> 59;
+}
+#else
+unsigned long long
+foo (unsigned long long x)
+{
+  x <<= 27;
+  return ((long long) x) >> 27;
+}
+#endif
+
+/* { dg-final { scan-assembler-not "sh\[lr\]d" } } */