Message ID | 00b701d86a0d$5c648400$152d8c00$@nextmovesoftware.com |
---|---|
State | New |
Series | [x86] Correct ix86_rtx_cost for multi-word multiplication. |
On Tue, May 17, 2022 at 6:44 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This is the i386 backend specific piece of my revised patch for
> PR middle-end/98865, where Richard Biener has suggested that I perform
> the desired transformation during RTL expansion where the backend can
> control whether it is profitable to convert a multiplication into a
> bit-wise AND and a negation. This works well for x86_64, but alas
> exposes a latent bug with -m32, where a DImode multiplication incorrectly
> appears to be cheaper than negdi2+anddi3(!?). The fix to ix86_rtx_costs
> is to report that a DImode (multi-word) multiplication actually requires
> three SImode multiplications and two SImode additions. This also corrects
> the cost of TImode multiplication on TARGET_64BIT.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}, with
> no new failures. This change avoids the need for a !ia32 target selector
> for the upcoming new test case gcc.target/i386/pr98865.c.
> Ok for mainline?
>
>
> 2022-05-17 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.cc (ix86_rtx_costs) [MULT]: When mode size
>         is wider than word_mode, a multiplication costs three word_mode
>         multiplications and two word_mode additions.

LGTM.

Thanks,
Uros.

>
> Thanks in advance,
> Roger
> --
>
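The "three multiplications and two additions" figure quoted above matches the usual schoolbook decomposition of a double-word product into word-sized pieces. The following is only an illustrative sketch in portable C++, not the code GCC emits; the function name `mul64_via_32` is hypothetical, and it assumes a platform where `unsigned int` is 32 bits wide so that `std::uint32_t` arithmetic wraps modulo 2^32.

```cpp
#include <cstdint>

// Low 64 bits of a 64x64-bit product computed from 32-bit halves only,
// roughly what a 32-bit target has to do for a DImode multiplication.
// Hypothetical helper for illustration; not part of GCC.
std::uint64_t mul64_via_32(std::uint64_t a, std::uint64_t b)
{
    const std::uint32_t a_lo = static_cast<std::uint32_t>(a);
    const std::uint32_t a_hi = static_cast<std::uint32_t>(a >> 32);
    const std::uint32_t b_lo = static_cast<std::uint32_t>(b);
    const std::uint32_t b_hi = static_cast<std::uint32_t>(b >> 32);

    // Multiplication 1: widening 32x32->64 product of the low halves.
    const std::uint64_t low = static_cast<std::uint64_t>(a_lo) * b_lo;

    // Multiplications 2 and 3: cross products. Only their low 32 bits
    // can reach the 64-bit result, so 32-bit wrapping arithmetic suffices.
    const std::uint32_t cross1 = a_hi * b_lo;
    const std::uint32_t cross2 = a_lo * b_hi;

    // Additions 1 and 2: fold both cross products into the high word.
    const std::uint32_t high =
        static_cast<std::uint32_t>(low >> 32) + cross1 + cross2;

    // The a_hi * b_hi term is shifted out of the low 64 bits entirely.
    return (static_cast<std::uint64_t>(high) << 32)
           | static_cast<std::uint32_t>(low);
}
```

For any inputs, the result should agree with a plain `a * b` on `std::uint64_t`, which is a convenient way to sanity-check the decomposition.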
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 86752a6..e8a2229 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20634,7 +20634,17 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
 	        op0 = XEXP (op0, 0), mode = GET_MODE (op0);
 	    }
 
-  	  *total = (cost->mult_init[MODE_INDEX (mode)]
+	  int mult_init;
+	  // Double word multiplication requires 3 mults and 2 adds.
+	  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+	    {
+	      mult_init = 3 * cost->mult_init[MODE_INDEX (word_mode)]
+			  + 2 * cost->add;
+	      nbits *= 3;
+	    }
+	  else mult_init = cost->mult_init[MODE_INDEX (mode)];
+
+	  *total = (mult_init
 		    + nbits * cost->mult_bit
 		    + rtx_cost (op0, mode, outer_code, opno, speed)
 		    + rtx_cost (op1, mode, outer_code, opno, speed));
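To make the effect of the hunk easier to see in isolation, here is a small standalone sketch of the before and after cost formulas. It is a simplification under stated assumptions: `ToyCosts`, `mult_cost_before` and `mult_cost_after` are hypothetical names, the per-mode `mult_init` table is reduced to two entries, and the `rtx_cost` terms for the two operands are left out because the patch does not change them.

```cpp
// Hypothetical, heavily simplified stand-in for the backend's cost table.
struct ToyCosts {
    int add;             // cost of one word-sized addition
    int mult_init_word;  // start-up cost of a word_mode multiplication
    int mult_init_wide;  // table entry the old code read for a wider mode
    int mult_bit;        // extra cost per counted multiplier bit
};

// Old formula: look up the start-up cost for the mode directly,
// even when the mode is wider than a machine word.
int mult_cost_before(const ToyCosts &c, bool wider_than_word, int nbits)
{
    const int mult_init = wider_than_word ? c.mult_init_wide
                                          : c.mult_init_word;
    return mult_init + nbits * c.mult_bit;
}

// Patched formula: a multi-word multiplication is charged as three
// word_mode multiplications plus two additions, and the per-bit term
// is tripled to match.
int mult_cost_after(const ToyCosts &c, bool wider_than_word, int nbits)
{
    int mult_init;
    if (wider_than_word) {
        mult_init = 3 * c.mult_init_word + 2 * c.add;
        nbits *= 3;
    } else {
        // Simplification: the real code indexes the table by the actual mode.
        mult_init = c.mult_init_word;
    }
    return mult_init + nbits * c.mult_bit;
}
```

With any table where the wide entry is close to the word entry, `mult_cost_before` rates a double-word multiplication about the same as a single-word one, while `mult_cost_after` rates it roughly three times higher. That is the kind of gap that lets a negate-and-AND sequence compare as cheaper with -m32.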