Message ID | 007701daac5a$d94d0af0$8be720d0$@nextmovesoftware.com |
---|---|
State | New |
Headers | show |
Series | [x86_64] Correct insn_cost of movabsq. | expand |
On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote: > > This single line patch fixes a strange quirk/glitch in i386's rtx_costs, > which considers an instruction loading a 64-bit constant to be significantly > cheaper than loading a 32-bit (or smaller) constant. > > Consider the two functions: > unsigned long long foo() { return 0x0123456789abcdefULL; } > unsigned int bar() { return 10; } > > and the corresponding lines from combine's dump file: > insn_cost 1 for #: r98:DI=0x123456789abcdef > insn_cost 4 for #: ax:SI=0xa > > The same issue can be seen in -dP assembler output. > movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4 > > The problem is that pattern_costs interpretation of rtx_costs contains > "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for > example a register or small immediate constant) is considered special, > and equivalent to a single instruction, but all other values are treated > as verbatim. Hence to make x86_64's 10-byte long movabsq instruction > slightly more expensive than a simple constant, rtx_costs needs to > return COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost > of movabsq is the intended value 5: > insn_cost 5 for #: r98:DI=0x123456789abcdef > and > movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4 > > > [I'd originally tried fixing this by adding a ix86_insn_cost target > hook, but the testsuite is very sensitive to the costing of insns]. > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? > > > 2024-05-22 Roger Sayle <roger@nextmovesoftware.com> > > gcc/ChangeLog > * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: > A CONST_INT that isn't x86_64_immediate_operand requires an extra > (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1. 1 of 20,796 [x86_64 PATCH] Correct insn_cost of movabsq. Inbox Roger Sayle 5:15 PM (12 minutes ago) to gcc-patches, me This single line patch fixes a strange quirk/glitch in i386's rtx_costs, which considers an instruction loading a 64-bit constant to be significantly cheaper than loading a 32-bit (or smaller) constant. Consider the two functions: unsigned long long foo() { return 0x0123456789abcdefULL; } unsigned int bar() { return 10; } and the corresponding lines from combine's dump file: insn_cost 1 for #: r98:DI=0x123456789abcdef insn_cost 4 for #: ax:SI=0xa The same issue can be seen in -dP assembler output. movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4 The problem is that pattern_costs interpretation of rtx_costs contains "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for example a register or small immediate constant) is considered special, and equivalent to a single instruction, but all other values are treated as verbatim. Hence to make x86_64's 10-byte long movabsq instruction slightly more expensive than a simple constant, rtx_costs needs to return COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost of movabsq is the intended value 5: insn_cost 5 for #: r98:DI=0x123456789abcdef and movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4 [I'd originally tried fixing this by adding a ix86_insn_cost target hook, but the testsuite is very sensitive to the costing of insns]. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-22 Roger Sayle <roger@nextmovesoftware.com> gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: A CONST_INT that isn't x86_64_immediate_operand requires an extra (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1. Thanks in advance, Roger -- One attachment • Scanned by Gmail Roger Sayle (nextmovesoftware.com), gcc-patches@gcc.gnu.org On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote: > > This single line patch fixes a strange quirk/glitch in i386's rtx_costs, > which considers an instruction loading a 64-bit constant to be significantly > cheaper than loading a 32-bit (or smaller) constant. > > Consider the two functions: > unsigned long long foo() { return 0x0123456789abcdefULL; } > unsigned int bar() { return 10; } > > and the corresponding lines from combine's dump file: > insn_cost 1 for #: r98:DI=0x123456789abcdef > insn_cost 4 for #: ax:SI=0xa > > The same issue can be seen in -dP assembler output. > movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4 > > The problem is that pattern_costs interpretation of rtx_costs contains > "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for > example a register or small immediate constant) is considered special, > and equivalent to a single instruction, but all other values are treated > as verbatim. Hence to make x86_64's 10-byte long movabsq instruction > slightly more expensive than a simple constant, rtx_costs needs to > return COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost > of movabsq is the intended value 5: > insn_cost 5 for #: r98:DI=0x123456789abcdef > and > movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4 > > > [I'd originally tried fixing this by adding a ix86_insn_cost target > hook, but the testsuite is very sensitive to the costing of insns]. > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? > > > 2024-05-22 Roger Sayle <roger@nextmovesoftware.com> > > gcc/ChangeLog > * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: > A CONST_INT that isn't x86_64_immediate_operand requires an extra > (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1. OK, with a small comment added. Thanks, Uros. > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc > index b4838b7..b4a9519 100644 > --- a/gcc/config/i386/i386.cc > +++ b/gcc/config/i386/i386.cc > @@ -21569,7 +21569,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno, > if (x86_64_immediate_operand (x, VOIDmode)) > *total = 0; > else > - *total = 1; > + *total = COSTS_N_INSNS (1) + 1; > return true; Please add a small comment that this cost belongs to movabs.
> Am 22.05.2024 um 17:30 schrieb Uros Bizjak <ubizjak@gmail.com>: > > On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote: >> >> This single line patch fixes a strange quirk/glitch in i386's rtx_costs, >> which considers an instruction loading a 64-bit constant to be significantly >> cheaper than loading a 32-bit (or smaller) constant. >> >> Consider the two functions: >> unsigned long long foo() { return 0x0123456789abcdefULL; } >> unsigned int bar() { return 10; } >> >> and the corresponding lines from combine's dump file: >> insn_cost 1 for #: r98:DI=0x123456789abcdef >> insn_cost 4 for #: ax:SI=0xa >> >> The same issue can be seen in -dP assembler output. >> movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4 >> >> The problem is that pattern_costs interpretation of rtx_costs contains >> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for >> example a register or small immediate constant) is considered special, >> and equivalent to a single instruction, but all other values are treated >> as verbatim. A zero cost is interpreted as „not implemented“ and assigned a cost of 1, assuming a COSTS_N_INSNS basing. IMO a bit bogus but I didn’t dare to argue further with Segher. Richard >> Hence to make x86_64's 10-byte long movabsq instruction >> slightly more expensive than a simple constant, rtx_costs needs to >> return COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost >> of movabsq is the intended value 5: >> insn_cost 5 for #: r98:DI=0x123456789abcdef >> and >> movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4 >> >> >> [I'd originally tried fixing this by adding a ix86_insn_cost target >> hook, but the testsuite is very sensitive to the costing of insns]. >> >> >> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap >> and make -k check, both with and without --target_board=unix{-m32} >> with no new failures. Ok for mainline? >> >> >> 2024-05-22 Roger Sayle <roger@nextmovesoftware.com> >> >> gcc/ChangeLog >> * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: >> A CONST_INT that isn't x86_64_immediate_operand requires an extra >> (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1. > > 1 of 20,796 > > [x86_64 PATCH] Correct insn_cost of movabsq. > > Inbox > > Roger Sayle > > 5:15 PM (12 minutes ago) > > > to gcc-patches, me > This single line patch fixes a strange quirk/glitch in i386's rtx_costs, > which considers an instruction loading a 64-bit constant to be significantly > cheaper than loading a 32-bit (or smaller) constant. > > Consider the two functions: > unsigned long long foo() { return 0x0123456789abcdefULL; } > unsigned int bar() { return 10; } > > and the corresponding lines from combine's dump file: > insn_cost 1 for #: r98:DI=0x123456789abcdef > insn_cost 4 for #: ax:SI=0xa > > The same issue can be seen in -dP assembler output. > movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4 > > The problem is that pattern_costs interpretation of rtx_costs contains > "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for > example a register or small immediate constant) is considered special, > and equivalent to a single instruction, but all other values are treated > as verbatim. Hence to make x86_64's 10-byte long movabsq instruction > slightly more expensive than a simple constant, rtx_costs needs to > return COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost > of movabsq is the intended value 5: > insn_cost 5 for #: r98:DI=0x123456789abcdef > and > movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4 > > > [I'd originally tried fixing this by adding a ix86_insn_cost target > hook, but the testsuite is very sensitive to the costing of insns]. > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32} > with no new failures. Ok for mainline? > > > 2024-05-22 Roger Sayle <roger@nextmovesoftware.com> > > gcc/ChangeLog > * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: > A CONST_INT that isn't x86_64_immediate_operand requires an extra > (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1. > > > Thanks in advance, > Roger > -- > > > One attachment • Scanned by Gmail > > > Roger Sayle (nextmovesoftware.com), gcc-patches@gcc.gnu.org > > >> On Wed, May 22, 2024 at 5:15 PM Roger Sayle <roger@nextmovesoftware.com> wrote: >> >> This single line patch fixes a strange quirk/glitch in i386's rtx_costs, >> which considers an instruction loading a 64-bit constant to be significantly >> cheaper than loading a 32-bit (or smaller) constant. >> >> Consider the two functions: >> unsigned long long foo() { return 0x0123456789abcdefULL; } >> unsigned int bar() { return 10; } >> >> and the corresponding lines from combine's dump file: >> insn_cost 1 for #: r98:DI=0x123456789abcdef >> insn_cost 4 for #: ax:SI=0xa >> >> The same issue can be seen in -dP assembler output. >> movabsq $81985529216486895, %rax # 5 [c=1 l=10] *movdi_internal/4 >> >> The problem is that pattern_costs interpretation of rtx_costs contains >> "return cost > 0 ? cost : COSTS_N_INSNS (1)" where a zero value (for >> example a register or small immediate constant) is considered special, >> and equivalent to a single instruction, but all other values are treated >> as verbatim. Hence to make x86_64's 10-byte long movabsq instruction >> slightly more expensive than a simple constant, rtx_costs needs to >> return COSTS_N_INSNS(1)+1 and not 1. With this change, the insn_cost >> of movabsq is the intended value 5: >> insn_cost 5 for #: r98:DI=0x123456789abcdef >> and >> movabsq $81985529216486895, %rax # 5 [c=5 l=10] *movdi_internal/4 >> >> >> [I'd originally tried fixing this by adding a ix86_insn_cost target >> hook, but the testsuite is very sensitive to the costing of insns]. >> >> >> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap >> and make -k check, both with and without --target_board=unix{-m32} >> with no new failures. Ok for mainline? >> >> >> 2024-05-22 Roger Sayle <roger@nextmovesoftware.com> >> >> gcc/ChangeLog >> * config/i386/i386.cc (ix86_rtx_costs) <case CONST_INT>: >> A CONST_INT that isn't x86_64_immediate_operand requires an extra >> (expensive) movabsq insn to load, so return COSTS_N_INSNS (1) + 1. > > OK, with a small comment added. > > Thanks, > Uros. > >> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc >> index b4838b7..b4a9519 100644 >> --- a/gcc/config/i386/i386.cc >> +++ b/gcc/config/i386/i386.cc >> @@ -21569,7 +21569,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno, >> if (x86_64_immediate_operand (x, VOIDmode)) >> *total = 0; >> else >> - *total = 1; >> + *total = COSTS_N_INSNS (1) + 1; >> return true; > > Please add a small comment that this cost belongs to movabs.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index b4838b7..b4a9519 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -21569,7 +21569,7 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno, if (x86_64_immediate_operand (x, VOIDmode)) *total = 0; else - *total = 1; + *total = COSTS_N_INSNS (1) + 1; return true; case CONST_DOUBLE: