Message ID | Zegk0SwZ8KtO1E8V@tucnak |
---|---|
State | New |
Series | i386: Fix up the vzeroupper REG_DEAD/REG_UNUSED note workaround [PR114190] |
On Wed, Mar 6, 2024 at 9:10 AM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> When writing the rest_of_handle_insert_vzeroupper workaround to manually
> remove all the REG_DEAD/REG_UNUSED notes from the IL, I've missed that
> there is a df_analyze () call right after it and that the problems added
> earlier in the pass, like df_note_add_problem () done during mode switching,
> don't affect just the next df_analyze () call right after it, but all
> other df_analyze () calls until the end of the current pass where
> df_finish_pass removes the optional problems.
>
> So, as can be seen on the following patch, the workaround doesn't actually
> work there, because while rest_of_handle_insert_vzeroupper carefully removes
> all REG_DEAD/REG_UNUSED notes, the df_analyze () call at the end of the
> function immediately adds them in again (so, I must say I have no idea
> why the workaround worked on the earlier testcases).
>
> Now, I could move the df_analyze () call just before the REG_DEAD/REG_UNUSED
> note removal loop, but I think the following patch is better, because
> the df_analyze () call doesn't have to recompute the problem when we don't
> care about it and will actively strip all traces of it away.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-03-06  Jakub Jelinek  <jakub@redhat.com>
>
>         PR rtl-optimization/114190
>         * config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
>         Call df_remove_problem for df_note before calling df_analyze.
>
>         * gcc.target/i386/avx-pr114190.c: New test.

OK.

Thanks,
Uros.
>
> --- gcc/config/i386/i386-features.cc.jj 2024-02-22 10:10:18.658032517 +0100
> +++ gcc/config/i386/i386-features.cc 2024-03-05 09:23:54.496112264 +0100
> @@ -2690,6 +2690,7 @@ rest_of_handle_insert_vzeroupper (void)
>         }
>      }
>
> +  df_remove_problem (df_note);
>    df_analyze ();
>    return 0;
>  }
> --- gcc/testsuite/gcc.target/i386/avx-pr114190.c.jj 2024-03-05 10:07:24.869454305 +0100
> +++ gcc/testsuite/gcc.target/i386/avx-pr114190.c 2024-03-05 10:06:52.870889687 +0100
> @@ -0,0 +1,27 @@
> +/* PR rtl-optimization/114190 */
> +/* { dg-do run { target avx } } */
> +/* { dg-options "-O2 -fno-dce -fharden-compares -mavx --param=max-rtl-if-conversion-unpredictable-cost=136 -mno-avx512f -Wno-psabi" } */
> +
> +#include "avx-check.h"
> +
> +typedef unsigned char U __attribute__((vector_size (64)));
> +typedef unsigned int V __attribute__((vector_size (64)));
> +U u;
> +
> +V
> +foo (V a, V b)
> +{
> +  u[0] = __builtin_sub_overflow (0, (int) a[0], &a[b[7] & 5]) ? -u[1] : -b[3];
> +  b ^= 0 != b;
> +  return (V) u + (V) a + (V) b;
> +}
> +
> +static void
> +avx_test (void)
> +{
> +  V x = foo ((V) { 1 }, (V) { 0, 0, 0, 1 });
> +  if (x[0] != -1U)
> +    __builtin_abort ();
> +  if (x[3] != -2U)
> +    __builtin_abort ();
> +}
>
> Jakub
>
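The ordering issue described in the message can be illustrated with a small standalone model. This is only a sketch: every name prefixed with toy_ is hypothetical and is not part of GCC's df API. It assumes nothing beyond what the message states, namely that an optional problem, once added, is recomputed by every later analysis call until it is removed, so notes stripped by hand reappear unless the problem is removed first.

```c
#include <stdbool.h>

/* Toy model only: all toy_* names are hypothetical, not GCC's df API.  */
#define TOY_N_INSNS 4

struct toy_pass
{
  bool note_problem_active;    /* models an optional problem such as df_note */
  bool has_note[TOY_N_INSNS];  /* models REG_DEAD/REG_UNUSED notes on insns */
};

/* Once added, the optional problem stays active until explicitly removed.  */
static void
toy_add_note_problem (struct toy_pass *p)
{
  p->note_problem_active = true;
}

static void
toy_remove_note_problem (struct toy_pass *p)
{
  p->note_problem_active = false;
}

/* Each analysis recomputes every active problem, re-creating the notes.  */
static void
toy_analyze (struct toy_pass *p)
{
  if (p->note_problem_active)
    for (int i = 0; i < TOY_N_INSNS; i++)
      p->has_note[i] = true;
}

/* Models the manual note-removal loop of the workaround.  */
static void
toy_strip_notes (struct toy_pass *p)
{
  for (int i = 0; i < TOY_N_INSNS; i++)
    p->has_note[i] = false;
}

static int
toy_count_notes (const struct toy_pass *p)
{
  int n = 0;
  for (int i = 0; i < TOY_N_INSNS; i++)
    n += p->has_note[i];
  return n;
}

/* Order before the patch: strip the notes, then analyze while the
   optional problem is still active.  The notes come straight back.  */
int
toy_workaround_before (void)
{
  struct toy_pass p = { 0 };
  toy_add_note_problem (&p);   /* added earlier in the same pass */
  toy_analyze (&p);
  toy_strip_notes (&p);
  toy_analyze (&p);            /* re-adds every note */
  return toy_count_notes (&p);
}

/* Order after the patch: remove the problem before the final analysis.  */
int
toy_workaround_after (void)
{
  struct toy_pass p = { 0 };
  toy_add_note_problem (&p);
  toy_analyze (&p);
  toy_strip_notes (&p);
  toy_remove_note_problem (&p);
  toy_analyze (&p);            /* nothing left to recompute */
  return toy_count_notes (&p);
}
```

In this model toy_workaround_before () leaves all four notes in place, while toy_workaround_after () leaves none. The model says nothing about how GCC implements df_remove_problem internally; it only mirrors the call order discussed above.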