Message ID | 20240923145602.3872286-1-pan2.li@intel.com |
---|---|
State | New |
Headers | show |
Series | [v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion | expand |
On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > This patch would like to fix the following ICE for -O2 -m32 of x86_64. > > during RTL pass: expand > JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned > int)': > JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in > expand_fn_using_insn, at internal-fn.cc:263 > 3 | void DequeueEvent(unsigned frame) { > | ^~~~~~~~~~~~ > 0x27b580d diagnostic_context::diagnostic_impl(rich_location*, > diagnostic_metadata const*, diagnostic_option_id, char const*, > __va_list_tag (*) [1], diagnostic_t) > ???:0 > 0x27c4a3f internal_error(char const*, ...) > ???:0 > 0x27b3994 fancy_abort(char const*, int, char const*) > ???:0 > 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int) > ???:0 > 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int) > ???:0 > 0xf2c87c expand_SAT_SUB(internal_fn, gcall*) > ???:0 > > We allowed the operand convert when matching SAT_SUB in match.pd, to support > the zip benchmark SAT_SUB pattern. Aka, > > (convert? (minus (convert1? @0) (convert1? @1))) for below sample code. > > void test (uint16_t *x, unsigned b, unsigned n) > { > unsigned a = 0; > register uint16_t *p = x; > > do { > a = *--p; > *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB > } while (--n); > } > > The pattern match for SAT_SUB itself may also act on below scalar sample > code too. > > unsigned long long GetTimeFromFrames(int); > unsigned long long GetMicroSeconds(); > > void DequeueEvent(unsigned frame) { > long long frame_time = GetTimeFromFrames(frame); > unsigned long long current_time = GetMicroSeconds(); > DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > } > > Aka: > > uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t); > > Then there will be a problem when ia32 or -m32 is given when compiling. > Because we only check the lhs (aka uint32_t) type is supported by ifn > and missed the operand (aka uint64_t). Mostly DImode is disabled for > 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding. > > The below test suites are passed for this patch. > * The rv64gcv fully regression test. > * The x86 bootstrap test. > * The x86 fully regression test. > > PR target/116814 This is not "target", but "middle-end" component. Even though the bug is exposed on x86_64 target, the fix is in the middle-end code, not in the target code. > gcc/ChangeLog: > > * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add > ifn is_supported check for operand TREE type. > > gcc/testsuite/ChangeLog: > > * g++.dg/torture/pr116814-1.C: New test. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++ > gcc/tree-ssa-math-opts.cc | 23 +++++++++++++++-------- > 2 files changed, 27 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C > > diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C > new file mode 100644 > index 00000000000..8db5b020cfd > --- /dev/null > +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C > @@ -0,0 +1,12 @@ > +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */ > +/* { dg-options "-O2 -m32" } */ Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. Uros, > + > +unsigned long long GetTimeFromFrames(int); > +unsigned long long GetMicroSeconds(); > + > +void DequeueEvent(unsigned frame) { > + long long frame_time = GetTimeFromFrames(frame); > + unsigned long long current_time = GetMicroSeconds(); > + > + DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > +} > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc > index d61668aacfc..361761cedef 100644 > --- a/gcc/tree-ssa-math-opts.cc > +++ b/gcc/tree-ssa-math-opts.cc > @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi, > internal_fn fn, tree lhs, tree op_0, > tree op_1) > { > - if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH)) > - { > - gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > - gimple_call_set_lhs (call, lhs); > - gsi_insert_before (gsi, call, GSI_SAME_STMT); > + tree lhs_type = TREE_TYPE (lhs); > + tree op_type = TREE_TYPE (op_0); > > - gimple_stmt_iterator psi = gsi_for_stmt (phi); > - remove_phi_node (&psi, /* release_lhs_p */ false); > - } > + if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH)) > + return; > + > + if (lhs_type != op_type > + && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH)) > + return; > + > + gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > + gimple_call_set_lhs (call, lhs); > + gsi_insert_before (gsi, call, GSI_SAME_STMT); > + > + gimple_stmt_iterator psi = gsi_for_stmt (phi); > + remove_phi_node (&psi, /* release_lhs_p */ false); > } > > /* > -- > 2.43.0 >
Thanks Uros for comments. > This is not "target", but "middle-end" component. Even though the bug > is exposed on x86_64 target, the fix is in the middle-end code, not in > the target code. Sure, will rename to middle-end. > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. Is there any suggestion to run the "ia32" test when configure gcc build? I first leverage ia32 but complain UNSUPPORTED for this case. Pan -----Original Message----- From: Uros Bizjak <ubizjak@gmail.com> Sent: Tuesday, September 24, 2024 2:17 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote: > > From: Pan Li <pan2.li@intel.com> > > This patch would like to fix the following ICE for -O2 -m32 of x86_64. > > during RTL pass: expand > JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned > int)': > JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in > expand_fn_using_insn, at internal-fn.cc:263 > 3 | void DequeueEvent(unsigned frame) { > | ^~~~~~~~~~~~ > 0x27b580d diagnostic_context::diagnostic_impl(rich_location*, > diagnostic_metadata const*, diagnostic_option_id, char const*, > __va_list_tag (*) [1], diagnostic_t) > ???:0 > 0x27c4a3f internal_error(char const*, ...) > ???:0 > 0x27b3994 fancy_abort(char const*, int, char const*) > ???:0 > 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int) > ???:0 > 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int) > ???:0 > 0xf2c87c expand_SAT_SUB(internal_fn, gcall*) > ???:0 > > We allowed the operand convert when matching SAT_SUB in match.pd, to support > the zip benchmark SAT_SUB pattern. Aka, > > (convert? (minus (convert1? @0) (convert1? @1))) for below sample code. > > void test (uint16_t *x, unsigned b, unsigned n) > { > unsigned a = 0; > register uint16_t *p = x; > > do { > a = *--p; > *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB > } while (--n); > } > > The pattern match for SAT_SUB itself may also act on below scalar sample > code too. > > unsigned long long GetTimeFromFrames(int); > unsigned long long GetMicroSeconds(); > > void DequeueEvent(unsigned frame) { > long long frame_time = GetTimeFromFrames(frame); > unsigned long long current_time = GetMicroSeconds(); > DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > } > > Aka: > > uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t); > > Then there will be a problem when ia32 or -m32 is given when compiling. > Because we only check the lhs (aka uint32_t) type is supported by ifn > and missed the operand (aka uint64_t). Mostly DImode is disabled for > 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding. > > The below test suites are passed for this patch. > * The rv64gcv fully regression test. > * The x86 bootstrap test. > * The x86 fully regression test. > > PR target/116814 This is not "target", but "middle-end" component. Even though the bug is exposed on x86_64 target, the fix is in the middle-end code, not in the target code. > gcc/ChangeLog: > > * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add > ifn is_supported check for operand TREE type. > > gcc/testsuite/ChangeLog: > > * g++.dg/torture/pr116814-1.C: New test. > > Signed-off-by: Pan Li <pan2.li@intel.com> > --- > gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++ > gcc/tree-ssa-math-opts.cc | 23 +++++++++++++++-------- > 2 files changed, 27 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C > > diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C > new file mode 100644 > index 00000000000..8db5b020cfd > --- /dev/null > +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C > @@ -0,0 +1,12 @@ > +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */ > +/* { dg-options "-O2 -m32" } */ Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. Uros, > + > +unsigned long long GetTimeFromFrames(int); > +unsigned long long GetMicroSeconds(); > + > +void DequeueEvent(unsigned frame) { > + long long frame_time = GetTimeFromFrames(frame); > + unsigned long long current_time = GetMicroSeconds(); > + > + DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > +} > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc > index d61668aacfc..361761cedef 100644 > --- a/gcc/tree-ssa-math-opts.cc > +++ b/gcc/tree-ssa-math-opts.cc > @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi, > internal_fn fn, tree lhs, tree op_0, > tree op_1) > { > - if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH)) > - { > - gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > - gimple_call_set_lhs (call, lhs); > - gsi_insert_before (gsi, call, GSI_SAME_STMT); > + tree lhs_type = TREE_TYPE (lhs); > + tree op_type = TREE_TYPE (op_0); > > - gimple_stmt_iterator psi = gsi_for_stmt (phi); > - remove_phi_node (&psi, /* release_lhs_p */ false); > - } > + if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH)) > + return; > + > + if (lhs_type != op_type > + && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH)) > + return; > + > + gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > + gimple_call_set_lhs (call, lhs); > + gsi_insert_before (gsi, call, GSI_SAME_STMT); > + > + gimple_stmt_iterator psi = gsi_for_stmt (phi); > + remove_phi_node (&psi, /* release_lhs_p */ false); > } > > /* > -- > 2.43.0 >
On Tue, Sep 24, 2024 at 8:24 AM Li, Pan2 <pan2.li@intel.com> wrote: > > Thanks Uros for comments. > > > This is not "target", but "middle-end" component. Even though the bug > > is exposed on x86_64 target, the fix is in the middle-end code, not in > > the target code. > > Sure, will rename to middle-end. > > > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. > > Is there any suggestion to run the "ia32" test when configure gcc build? > I first leverage ia32 but complain UNSUPPORTED for this case. You can add the following to your testsuite run: RUNTESTFLAGS="--target-board=unix\{,-m32\}" e.g: make -j N -k check RUNTESTFLAGS=... (where N is the number of make threads) You can also add "dg.exp" or "dg.exp=pr12345.c" (or any other exp file or testcase name) to RUNTESTFLAGS to run only one exp file or a single test. Uros. > Pan > > -----Original Message----- > From: Uros Bizjak <ubizjak@gmail.com> > Sent: Tuesday, September 24, 2024 2:17 PM > To: Li, Pan2 <pan2.li@intel.com> > Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com > Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion > > On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote: > > > > From: Pan Li <pan2.li@intel.com> > > > > This patch would like to fix the following ICE for -O2 -m32 of x86_64. > > > > during RTL pass: expand > > JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned > > int)': > > JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in > > expand_fn_using_insn, at internal-fn.cc:263 > > 3 | void DequeueEvent(unsigned frame) { > > | ^~~~~~~~~~~~ > > 0x27b580d diagnostic_context::diagnostic_impl(rich_location*, > > diagnostic_metadata const*, diagnostic_option_id, char const*, > > __va_list_tag (*) [1], diagnostic_t) > > ???:0 > > 0x27c4a3f internal_error(char const*, ...) > > ???:0 > > 0x27b3994 fancy_abort(char const*, int, char const*) > > ???:0 > > 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int) > > ???:0 > > 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int) > > ???:0 > > 0xf2c87c expand_SAT_SUB(internal_fn, gcall*) > > ???:0 > > > > We allowed the operand convert when matching SAT_SUB in match.pd, to support > > the zip benchmark SAT_SUB pattern. Aka, > > > > (convert? (minus (convert1? @0) (convert1? @1))) for below sample code. > > > > void test (uint16_t *x, unsigned b, unsigned n) > > { > > unsigned a = 0; > > register uint16_t *p = x; > > > > do { > > a = *--p; > > *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB > > } while (--n); > > } > > > > The pattern match for SAT_SUB itself may also act on below scalar sample > > code too. > > > > unsigned long long GetTimeFromFrames(int); > > unsigned long long GetMicroSeconds(); > > > > void DequeueEvent(unsigned frame) { > > long long frame_time = GetTimeFromFrames(frame); > > unsigned long long current_time = GetMicroSeconds(); > > DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > > } > > > > Aka: > > > > uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t); > > > > Then there will be a problem when ia32 or -m32 is given when compiling. > > Because we only check the lhs (aka uint32_t) type is supported by ifn > > and missed the operand (aka uint64_t). Mostly DImode is disabled for > > 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding. > > > > The below test suites are passed for this patch. > > * The rv64gcv fully regression test. > > * The x86 bootstrap test. > > * The x86 fully regression test. > > > > PR target/116814 > > This is not "target", but "middle-end" component. Even though the bug > is exposed on x86_64 target, the fix is in the middle-end code, not in > the target code. > > > gcc/ChangeLog: > > > > * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add > > ifn is_supported check for operand TREE type. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/torture/pr116814-1.C: New test. > > > > Signed-off-by: Pan Li <pan2.li@intel.com> > > --- > > gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++ > > gcc/tree-ssa-math-opts.cc | 23 +++++++++++++++-------- > > 2 files changed, 27 insertions(+), 8 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C > > > > diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C > > new file mode 100644 > > index 00000000000..8db5b020cfd > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C > > @@ -0,0 +1,12 @@ > > +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */ > > +/* { dg-options "-O2 -m32" } */ > > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. > > Uros, > > > + > > +unsigned long long GetTimeFromFrames(int); > > +unsigned long long GetMicroSeconds(); > > + > > +void DequeueEvent(unsigned frame) { > > + long long frame_time = GetTimeFromFrames(frame); > > + unsigned long long current_time = GetMicroSeconds(); > > + > > + DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > > +} > > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc > > index d61668aacfc..361761cedef 100644 > > --- a/gcc/tree-ssa-math-opts.cc > > +++ b/gcc/tree-ssa-math-opts.cc > > @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi, > > internal_fn fn, tree lhs, tree op_0, > > tree op_1) > > { > > - if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH)) > > - { > > - gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > > - gimple_call_set_lhs (call, lhs); > > - gsi_insert_before (gsi, call, GSI_SAME_STMT); > > + tree lhs_type = TREE_TYPE (lhs); > > + tree op_type = TREE_TYPE (op_0); > > > > - gimple_stmt_iterator psi = gsi_for_stmt (phi); > > - remove_phi_node (&psi, /* release_lhs_p */ false); > > - } > > + if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH)) > > + return; > > + > > + if (lhs_type != op_type > > + && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH)) > > + return; > > + > > + gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > > + gimple_call_set_lhs (call, lhs); > > + gsi_insert_before (gsi, call, GSI_SAME_STMT); > > + > > + gimple_stmt_iterator psi = gsi_for_stmt (phi); > > + remove_phi_node (&psi, /* release_lhs_p */ false); > > } > > > > /* > > -- > > 2.43.0 > >
Got it and thanks, let me rerun to make sure it works well as expected. Pan -----Original Message----- From: Uros Bizjak <ubizjak@gmail.com> Sent: Tuesday, September 24, 2024 2:33 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion On Tue, Sep 24, 2024 at 8:24 AM Li, Pan2 <pan2.li@intel.com> wrote: > > Thanks Uros for comments. > > > This is not "target", but "middle-end" component. Even though the bug > > is exposed on x86_64 target, the fix is in the middle-end code, not in > > the target code. > > Sure, will rename to middle-end. > > > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. > > Is there any suggestion to run the "ia32" test when configure gcc build? > I first leverage ia32 but complain UNSUPPORTED for this case. You can add the following to your testsuite run: RUNTESTFLAGS="--target-board=unix\{,-m32\}" e.g: make -j N -k check RUNTESTFLAGS=... (where N is the number of make threads) You can also add "dg.exp" or "dg.exp=pr12345.c" (or any other exp file or testcase name) to RUNTESTFLAGS to run only one exp file or a single test. Uros. > Pan > > -----Original Message----- > From: Uros Bizjak <ubizjak@gmail.com> > Sent: Tuesday, September 24, 2024 2:17 PM > To: Li, Pan2 <pan2.li@intel.com> > Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com > Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion > > On Mon, Sep 23, 2024 at 4:58 PM <pan2.li@intel.com> wrote: > > > > From: Pan Li <pan2.li@intel.com> > > > > This patch would like to fix the following ICE for -O2 -m32 of x86_64. > > > > during RTL pass: expand > > JackMidiAsyncWaitQueue.cpp.cpp: In function 'void DequeueEvent(unsigned > > int)': > > JackMidiAsyncWaitQueue.cpp.cpp:3:6: internal compiler error: in > > expand_fn_using_insn, at internal-fn.cc:263 > > 3 | void DequeueEvent(unsigned frame) { > > | ^~~~~~~~~~~~ > > 0x27b580d diagnostic_context::diagnostic_impl(rich_location*, > > diagnostic_metadata const*, diagnostic_option_id, char const*, > > __va_list_tag (*) [1], diagnostic_t) > > ???:0 > > 0x27c4a3f internal_error(char const*, ...) > > ???:0 > > 0x27b3994 fancy_abort(char const*, int, char const*) > > ???:0 > > 0xf25ae5 expand_fn_using_insn(gcall*, insn_code, unsigned int, unsigned int) > > ???:0 > > 0xf2a124 expand_direct_optab_fn(internal_fn, gcall*, optab_tag, unsigned int) > > ???:0 > > 0xf2c87c expand_SAT_SUB(internal_fn, gcall*) > > ???:0 > > > > We allowed the operand convert when matching SAT_SUB in match.pd, to support > > the zip benchmark SAT_SUB pattern. Aka, > > > > (convert? (minus (convert1? @0) (convert1? @1))) for below sample code. > > > > void test (uint16_t *x, unsigned b, unsigned n) > > { > > unsigned a = 0; > > register uint16_t *p = x; > > > > do { > > a = *--p; > > *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB > > } while (--n); > > } > > > > The pattern match for SAT_SUB itself may also act on below scalar sample > > code too. > > > > unsigned long long GetTimeFromFrames(int); > > unsigned long long GetMicroSeconds(); > > > > void DequeueEvent(unsigned frame) { > > long long frame_time = GetTimeFromFrames(frame); > > unsigned long long current_time = GetMicroSeconds(); > > DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > > } > > > > Aka: > > > > uint32_t a = (uint32_t)SAT_SUB(uint64_t, uint64_t); > > > > Then there will be a problem when ia32 or -m32 is given when compiling. > > Because we only check the lhs (aka uint32_t) type is supported by ifn > > and missed the operand (aka uint64_t). Mostly DImode is disabled for > > 32 bits target like ia32 or rv32gcv, and then trigger ICE when expanding. > > > > The below test suites are passed for this patch. > > * The rv64gcv fully regression test. > > * The x86 bootstrap test. > > * The x86 fully regression test. > > > > PR target/116814 > > This is not "target", but "middle-end" component. Even though the bug > is exposed on x86_64 target, the fix is in the middle-end code, not in > the target code. > > > gcc/ChangeLog: > > > > * tree-ssa-math-opts.cc (build_saturation_binary_arith_call): Add > > ifn is_supported check for operand TREE type. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/torture/pr116814-1.C: New test. > > > > Signed-off-by: Pan Li <pan2.li@intel.com> > > --- > > gcc/testsuite/g++.dg/torture/pr116814-1.C | 12 ++++++++++++ > > gcc/tree-ssa-math-opts.cc | 23 +++++++++++++++-------- > > 2 files changed, 27 insertions(+), 8 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/torture/pr116814-1.C > > > > diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C > > new file mode 100644 > > index 00000000000..8db5b020cfd > > --- /dev/null > > +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C > > @@ -0,0 +1,12 @@ > > +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */ > > +/* { dg-options "-O2 -m32" } */ > > Please remove -m32 and use "{ dg-do compile { target ia32 } }" instead. > > Uros, > > > + > > +unsigned long long GetTimeFromFrames(int); > > +unsigned long long GetMicroSeconds(); > > + > > +void DequeueEvent(unsigned frame) { > > + long long frame_time = GetTimeFromFrames(frame); > > + unsigned long long current_time = GetMicroSeconds(); > > + > > + DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); > > +} > > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc > > index d61668aacfc..361761cedef 100644 > > --- a/gcc/tree-ssa-math-opts.cc > > +++ b/gcc/tree-ssa-math-opts.cc > > @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi, > > internal_fn fn, tree lhs, tree op_0, > > tree op_1) > > { > > - if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH)) > > - { > > - gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > > - gimple_call_set_lhs (call, lhs); > > - gsi_insert_before (gsi, call, GSI_SAME_STMT); > > + tree lhs_type = TREE_TYPE (lhs); > > + tree op_type = TREE_TYPE (op_0); > > > > - gimple_stmt_iterator psi = gsi_for_stmt (phi); > > - remove_phi_node (&psi, /* release_lhs_p */ false); > > - } > > + if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH)) > > + return; > > + > > + if (lhs_type != op_type > > + && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH)) > > + return; > > + > > + gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); > > + gimple_call_set_lhs (call, lhs); > > + gsi_insert_before (gsi, call, GSI_SAME_STMT); > > + > > + gimple_stmt_iterator psi = gsi_for_stmt (phi); > > + remove_phi_node (&psi, /* release_lhs_p */ false); > > } > > > > /* > > -- > > 2.43.0 > >
On Tue, Sep 24, 2024 at 8:53 AM Li, Pan2 <pan2.li@intel.com> wrote: > > Got it and thanks, let me rerun to make sure it works well as expected. For reference, this is documented in: https://gcc.gnu.org/wiki/Testing_GCC https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html https://gcc.gnu.org/install/test.html Uros.
Got it, thanks a lot. Pan -----Original Message----- From: Uros Bizjak <ubizjak@gmail.com> Sent: Tuesday, September 24, 2024 3:29 PM To: Li, Pan2 <pan2.li@intel.com> Cc: gcc-patches@gcc.gnu.org; richard.guenther@gmail.com; Tamar.Christina@arm.com; juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; rdapp.gcc@gmail.com Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE for SAT_SUB matching operand promotion On Tue, Sep 24, 2024 at 8:53 AM Li, Pan2 <pan2.li@intel.com> wrote: > > Got it and thanks, let me rerun to make sure it works well as expected. For reference, this is documented in: https://gcc.gnu.org/wiki/Testing_GCC https://gcc-newbies-guide.readthedocs.io/en/latest/working-with-the-testsuite.html https://gcc.gnu.org/install/test.html Uros.
diff --git a/gcc/testsuite/g++.dg/torture/pr116814-1.C b/gcc/testsuite/g++.dg/torture/pr116814-1.C new file mode 100644 index 00000000000..8db5b020cfd --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr116814-1.C @@ -0,0 +1,12 @@ +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options "-O2 -m32" } */ + +unsigned long long GetTimeFromFrames(int); +unsigned long long GetMicroSeconds(); + +void DequeueEvent(unsigned frame) { + long long frame_time = GetTimeFromFrames(frame); + unsigned long long current_time = GetMicroSeconds(); + + DequeueEvent(frame_time < current_time ? 0 : frame_time - current_time); +} diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc index d61668aacfc..361761cedef 100644 --- a/gcc/tree-ssa-math-opts.cc +++ b/gcc/tree-ssa-math-opts.cc @@ -4042,15 +4042,22 @@ build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, gphi *phi, internal_fn fn, tree lhs, tree op_0, tree op_1) { - if (direct_internal_fn_supported_p (fn, TREE_TYPE (lhs), OPTIMIZE_FOR_BOTH)) - { - gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); - gimple_call_set_lhs (call, lhs); - gsi_insert_before (gsi, call, GSI_SAME_STMT); + tree lhs_type = TREE_TYPE (lhs); + tree op_type = TREE_TYPE (op_0); - gimple_stmt_iterator psi = gsi_for_stmt (phi); - remove_phi_node (&psi, /* release_lhs_p */ false); - } + if (!direct_internal_fn_supported_p (fn, lhs_type, OPTIMIZE_FOR_BOTH)) + return; + + if (lhs_type != op_type + && !direct_internal_fn_supported_p (fn, op_type, OPTIMIZE_FOR_BOTH)) + return; + + gcall *call = gimple_build_call_internal (fn, 2, op_0, op_1); + gimple_call_set_lhs (call, lhs); + gsi_insert_before (gsi, call, GSI_SAME_STMT); + + gimple_stmt_iterator psi = gsi_for_stmt (phi); + remove_phi_node (&psi, /* release_lhs_p */ false); } /*