From patchwork Thu Nov 16 18:16:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1864878
From: Uros Bizjak
Date: Thu, 16 Nov 2023 19:16:47 +0100
Subject: [committed] i386: Optimize QImode insn with high input registers
To: "gcc-patches@gcc.gnu.org"

Sometimes the compiler emits the following code with qi_ext_0:

    shrl    $8, %eax
    addb    %bh, %al

The patch introduces new low-part QImode insn patterns with both of
their input arguments extracted from high registers.  This invalid insn
is split after reload into a move from the high register and a qi_ext_0
instruction.  The combine pass is able to convert the shift to a
zero/sign-extract sub-RTX, which we split to the optimal:

    movzbl  %bh, %edx
    addb    %ah, %dl

    PR target/78904

gcc/ChangeLog:

    * config/i386/i386.md (*addqi_ext2_0): New
    define_insn_and_split pattern.
    (*subqi_ext2_0): Ditto.
    (*qi_ext2_0): Ditto.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/pr78904-10.c: New test.
    * gcc.target/i386/pr78904-10a.c: New test.
    * gcc.target/i386/pr78904-10b.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f5407ab3054..1b5a794b9e5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -7069,6 +7069,39 @@ (define_insn "*addqi_ext_0"
    (set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
+(define_insn_and_split "*addqi_ext2_0"
+  [(set (match_operand:QI 0 "register_operand" "=&Q")
+        (plus:QI
+          (subreg:QI
+            (match_operator:SWI248 3 "extract_operator"
+              [(match_operand 1 "int248_register_operand" "Q")
+               (const_int 8)
+               (const_int 8)]) 0)
+          (subreg:QI
+            (match_operator:SWI248 4 "extract_operator"
+              [(match_operand 2 "int248_register_operand" "Q")
+               (const_int 8)
+               (const_int 8)]) 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (subreg:QI
+          (match_op_dup 4
+            [(match_dup 2) (const_int 8) (const_int 8)]) 0))
+   (parallel
+     [(set (match_dup 0)
+           (plus:QI
+             (subreg:QI
+               (match_op_dup 3
+                 [(match_dup 1) (const_int 8) (const_int 8)]) 0)
+           (match_dup 0)))
+      (clobber (reg:CC FLAGS_REG))])]
+  ""
+  [(set_attr "type" "alu")
+   (set_attr "mode" "QI")])
+
 (define_expand "addqi_ext_1"
   [(parallel [(set (zero_extract:HI
                (match_operand:HI 0 "register_operand")
@@ -7814,6 +7847,39 @@ (define_insn "*subqi_ext_0"
    (set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
+(define_insn_and_split "*subqi_ext2_0"
+  [(set (match_operand:QI 0 "register_operand" "=&Q")
+        (minus:QI
+          (subreg:QI
+            (match_operator:SWI248 3 "extract_operator"
+              [(match_operand 1 "int248_register_operand" "Q")
+               (const_int 8)
+               (const_int 8)]) 0)
+          (subreg:QI
+            (match_operator:SWI248 4 "extract_operator"
+              [(match_operand 2 "int248_register_operand" "Q")
+               (const_int 8)
+               (const_int 8)]) 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (subreg:QI
+          (match_op_dup 3
+            [(match_dup 1) (const_int 8) (const_int 8)]) 0))
+   (parallel
+     [(set (match_dup 0)
+           (minus:QI
+             (match_dup 0)
+             (subreg:QI
+               (match_op_dup 4
+                 [(match_dup 2) (const_int 8) (const_int 8)]) 0)))
+      (clobber (reg:CC FLAGS_REG))])]
+  ""
+  [(set_attr "type" "alu")
+   (set_attr "mode" "QI")])
+
 ;; Alternative 1 is needed to work around LRA limitation, see PR82524.
 (define_insn_and_split "*subqi_ext_1"
   [(set (zero_extract:SWI248
@@ -11815,6 +11881,39 @@ (define_insn "*qi_ext_0"
    (set_attr "type" "alu")
    (set_attr "mode" "QI")])
 
+(define_insn_and_split "*qi_ext2_0"
+  [(set (match_operand:QI 0 "register_operand" "=&Q")
+        (any_logic:QI
+          (subreg:QI
+            (match_operator:SWI248 3 "extract_operator"
+              [(match_operand 1 "int248_register_operand" "Q")
+               (const_int 8)
+               (const_int 8)]) 0)
+          (subreg:QI
+            (match_operator:SWI248 4 "extract_operator"
+              [(match_operand 2 "int248_register_operand" "Q")
+               (const_int 8)
+               (const_int 8)]) 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (subreg:QI
+          (match_op_dup 4
+            [(match_dup 2) (const_int 8) (const_int 8)]) 0))
+   (parallel
+     [(set (match_dup 0)
+           (any_logic:QI
+             (subreg:QI
+               (match_op_dup 3
+                 [(match_dup 1) (const_int 8) (const_int 8)]) 0)
+           (match_dup 0)))
+      (clobber (reg:CC FLAGS_REG))])]
+  ""
+  [(set_attr "type" "alu")
+   (set_attr "mode" "QI")])
+
 (define_expand "andqi_ext_1"
   [(parallel [(set (zero_extract:HI
                (match_operand:HI 0 "register_operand")
diff --git a/gcc/testsuite/gcc.target/i386/pr78904-10.c b/gcc/testsuite/gcc.target/i386/pr78904-10.c
new file mode 100644
index 00000000000..079629150df
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr78904-10.c
@@ -0,0 +1,47 @@
+/* PR target/78904 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-additional-options "-mregparm=3" { target ia32 } } */
+/* { dg-final { scan-assembler-not "shr" } } */
+
+struct S1
+{
+  unsigned char pad1;
+  unsigned char val;
+  unsigned short pad2;
+};
+
+char test_and (struct S1 a, struct S1 b)
+{
+  return a.val & b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]andb" } } */
+
+char test_or (struct S1 a, struct S1 b)
+{
+  return a.val | b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]orb" } } */
+
+char test_xor (struct S1 a, struct S1 b)
+{
+  return a.val ^ b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]xorb" } } */
+
+char test_add (struct S1 a, struct S1 b)
+{
+  return a.val + b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]addb" } } */
+
+char test_sub (struct S1 a, struct S1 b)
+{
+  return a.val - b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]subb" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr78904-10a.c b/gcc/testsuite/gcc.target/i386/pr78904-10a.c
new file mode 100644
index 00000000000..101402867b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr78904-10a.c
@@ -0,0 +1,46 @@
+/* PR target/78904 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-additional-options "-mregparm=3" { target ia32 } } */
+/* { dg-final { scan-assembler-not "shr" } } */
+
+struct S1
+{
+  unsigned char pad1;
+  unsigned char val;
+};
+
+char test_and (struct S1 a, struct S1 b)
+{
+  return a.val & b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]andb" } } */
+
+char test_or (struct S1 a, struct S1 b)
+{
+  return a.val | b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]orb" } } */
+
+char test_xor (struct S1 a, struct S1 b)
+{
+  return a.val ^ b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]xorb" } } */
+
+char test_add (struct S1 a, struct S1 b)
+{
+  return a.val + b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]addb" } } */
+
+char test_sub (struct S1 a, struct S1 b)
+{
+  return a.val - b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]subb" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr78904-10b.c b/gcc/testsuite/gcc.target/i386/pr78904-10b.c
new file mode 100644
index 00000000000..376acf81962
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr78904-10b.c
@@ -0,0 +1,47 @@
+/* PR target/78904 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-final { scan-assembler-not "shr" } } */
+
+struct S1
+{
+  unsigned char pad1;
+  unsigned char val;
+  unsigned short pad2;
+  unsigned int pad3;
+};
+
+char test_and (struct S1 a, struct S1 b)
+{
+  return a.val & b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]andb" } } */
+
+char test_or (struct S1 a, struct S1 b)
+{
+  return a.val | b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]orb" } } */
+
+char test_xor (struct S1 a, struct S1 b)
+{
+  return a.val ^ b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]xorb" } } */
+
+char test_add (struct S1 a, struct S1 b)
+{
+  return a.val + b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]addb" } } */
+
+char test_sub (struct S1 a, struct S1 b)
+{
+  return a.val - b.val;
+}
+
+/* { dg-final { scan-assembler "\[ \t\]subb" } } */