Message ID | 67552cb3-e08e-4003-a45d-9ed64bd7da43@gmail.com |
---|---|
State | New |
Headers | show |
Series | expmed: Get vec_extract element mode from insn_data, [PR112999] | expand |
Robin Dapp <rdapp.gcc@gmail.com> writes: > Hi, > > this is a bit of a follow up of the latest expmed change. > > In extract_bit_field_1 we try to get a better vector mode before > extracting from it. Better refers to the case when the requested target > mode does not equal the inner mode of the vector to extract from and we > have an equivalent tieable vector mode with a fitting inner mode. > > On riscv this triggered an ICE (PR112999) because we would take the > detour of extracting from a mask-mode vector via a vector integer mode. > One element of that mode could be subreg-punned with TImode which, in > turn, would need to be operated on in DImode chunks. > > As the backend might return the extracted value in a different mode than > the inner mode of the input vector, we might already have a mode > equivalent to the target mode. Therefore, this patch first obtains the > mode the backend uses for the particular vec_extract and then compares > it against the target mode. Only if those disagree we try to find a > better mode. Otherwise we continue with the initial one. > > Bootstrapped and regtested on x86, aarch64 and power10. Regtested > on riscv. This doesn't seem like the right condition. The mode of the operand is semantically arbitrary (as long as it has enough bits). E.g. if the pattern happens to have a HImode operand, it sounds like the problem you're describing would still fire for SImode. It looks like: FOR_EACH_MODE_FROM (new_mode, new_mode) if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) && targetm.vector_mode_supported_p (new_mode) && targetm.modes_tieable_p (GET_MODE (op0), new_mode)) break; should at least test whether the bitpos is a multiple of GET_MODE_UNIT_SIZE (new_mode), otherwise the new mode isn't really better. Arguably it should also test whether bitnum is equal to GET_MODE_UNIT_SIZE (new_mode). Not sure whether there'll be any fallout from that, but it seems worth trying. Thanks, Richard > > Regards > Robin > > gcc/ChangeLog: > > PR target/112999 > > * expmed.cc (extract_bit_field_1): Get vec_extract's result > element mode from insn_data and compare it to the target mode. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/autovec/pr112999.c: New test. > --- > gcc/expmed.cc | 17 +++++++++++++++-- > .../gcc.target/riscv/rvv/autovec/pr112999.c | 17 +++++++++++++++++ > 2 files changed, 32 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc > index ed17850ff74..6fbe4d9cfaf 100644 > --- a/gcc/expmed.cc > +++ b/gcc/expmed.cc > @@ -1722,10 +1722,23 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, > } > } > > + /* The target may prefer to return the extracted value in a different mode > + than INNERMODE. */ > + machine_mode outermode = GET_MODE (op0); > + machine_mode element_mode = GET_MODE_INNER (outermode); > + if (VECTOR_MODE_P (outermode) && !MEM_P (op0)) > + { > + enum insn_code icode > + = convert_optab_handler (vec_extract_optab, outermode, element_mode); > + > + if (icode != CODE_FOR_nothing) > + element_mode = insn_data[icode].operand[0].mode; > + } > + > /* See if we can get a better vector mode before extracting. */ > if (VECTOR_MODE_P (GET_MODE (op0)) > && !MEM_P (op0) > - && GET_MODE_INNER (GET_MODE (op0)) != tmode) > + && element_mode != tmode) > { > machine_mode new_mode; > > @@ -1755,7 +1768,7 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, > /* Use vec_extract patterns for extracting parts of vectors whenever > available. If that fails, see whether the current modes and bitregion > give a natural subreg. */ > - machine_mode outermode = GET_MODE (op0); > + outermode = GET_MODE (op0); > if (VECTOR_MODE_P (outermode) && !MEM_P (op0)) > { > scalar_mode innermode = GET_MODE_INNER (outermode); > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c > new file mode 100644 > index 00000000000..c049c5a0386 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c > @@ -0,0 +1,17 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax -O3 -fno-vect-cost-model -fno-tree-loop-distribute-patterns" } */ > + > +int a[1024]; > +int b[1024]; > + > +_Bool > +fn1 () > +{ > + _Bool tem; > + for (int i = 0; i < 1024; ++i) > + { > + tem = !a[i]; > + b[i] = tem; > + } > + return tem; > +}
> It looks like: > > FOR_EACH_MODE_FROM (new_mode, new_mode) > if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) > && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) > && targetm.vector_mode_supported_p (new_mode) > && targetm.modes_tieable_p (GET_MODE (op0), new_mode)) > break; > > should at least test whether the bitpos is a multiple of > GET_MODE_UNIT_SIZE (new_mode), otherwise the new mode isn't really > better. Arguably it should also test whether bitnum is equal > to GET_MODE_UNIT_SIZE (new_mode). > > Not sure whether there'll be any fallout from that, but it seems > worth trying. Thanks, bootstrapped and regtested the attached v2 without fallout on x86, aarch64 and power10. Tested on riscv. Regards Robin Subject: [PATCH v2] expmed: Compare unit_precision for better mode. In extract_bit_field_1 we try to get a better vector mode before extracting from it. Better refers to the case when the requested target mode does not equal the inner mode of the vector to extract from and we have an equivalent tieable vector mode with a fitting inner mode. On riscv this triggered an ICE (PR112999) because we would take the detour of extracting from a mask-mode vector via a vector integer mode. One element of that mode could be subreg-punned with TImode which, in turn, would need to be operated on in DImode chunks. This patch adds && known_eq (bitsize, GET_MODE_UNIT_PRECISION (new_mode)) && multiple_p (bitnum, GET_MODE_UNIT_PRECISION (new_mode)) to the list of criteria for a better mode. gcc/ChangeLog: PR target/112999 * expmed.cc (extract_bit_field_1): Ensure better mode has fitting unit_precision. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/pr112999.c: New test. --- gcc/expmed.cc | 2 ++ .../gcc.target/riscv/rvv/autovec/pr112999.c | 17 +++++++++++++++++ 2 files changed, 19 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c diff --git a/gcc/expmed.cc b/gcc/expmed.cc index d75314096b4..05331dd5d82 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -1745,6 +1745,8 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, FOR_EACH_MODE_FROM (new_mode, new_mode) if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) + && known_eq (bitsize, GET_MODE_UNIT_PRECISION (new_mode)) + && multiple_p (bitnum, GET_MODE_UNIT_PRECISION (new_mode)) && targetm.vector_mode_supported_p (new_mode) && targetm.modes_tieable_p (GET_MODE (op0), new_mode)) break; diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c new file mode 100644 index 00000000000..c049c5a0386 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax -O3 -fno-vect-cost-model -fno-tree-loop-distribute-patterns" } */ + +int a[1024]; +int b[1024]; + +_Bool +fn1 () +{ + _Bool tem; + for (int i = 0; i < 1024; ++i) + { + tem = !a[i]; + b[i] = tem; + } + return tem; +}
Robin Dapp <rdapp.gcc@gmail.com> writes: >> It looks like: >> >> FOR_EACH_MODE_FROM (new_mode, new_mode) >> if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) >> && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) >> && targetm.vector_mode_supported_p (new_mode) >> && targetm.modes_tieable_p (GET_MODE (op0), new_mode)) >> break; >> >> should at least test whether the bitpos is a multiple of >> GET_MODE_UNIT_SIZE (new_mode), otherwise the new mode isn't really >> better. Arguably it should also test whether bitnum is equal >> to GET_MODE_UNIT_SIZE (new_mode). >> >> Not sure whether there'll be any fallout from that, but it seems >> worth trying. > > Thanks, bootstrapped and regtested the attached v2 without fallout > on x86, aarch64 and power10. Tested on riscv. Nice. No fallout on three targets seems promising. > Regards > Robin > > Subject: [PATCH v2] expmed: Compare unit_precision for better mode. > > In extract_bit_field_1 we try to get a better vector mode before > extracting from it. Better refers to the case when the requested target > mode does not equal the inner mode of the vector to extract from and we > have an equivalent tieable vector mode with a fitting inner mode. > > On riscv this triggered an ICE (PR112999) because we would take the > detour of extracting from a mask-mode vector via a vector integer mode. > One element of that mode could be subreg-punned with TImode which, in > turn, would need to be operated on in DImode chunks. > > This patch adds > > && known_eq (bitsize, GET_MODE_UNIT_PRECISION (new_mode)) > && multiple_p (bitnum, GET_MODE_UNIT_PRECISION (new_mode)) > > to the list of criteria for a better mode. > > gcc/ChangeLog: > > PR target/112999 > > * expmed.cc (extract_bit_field_1): Ensure better mode > has fitting unit_precision. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/autovec/pr112999.c: New test. OK, thanks. Richard > --- > gcc/expmed.cc | 2 ++ > .../gcc.target/riscv/rvv/autovec/pr112999.c | 17 +++++++++++++++++ > 2 files changed, 19 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc > index d75314096b4..05331dd5d82 100644 > --- a/gcc/expmed.cc > +++ b/gcc/expmed.cc > @@ -1745,6 +1745,8 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, > FOR_EACH_MODE_FROM (new_mode, new_mode) > if (known_eq (GET_MODE_SIZE (new_mode), GET_MODE_SIZE (GET_MODE (op0))) > && known_eq (GET_MODE_UNIT_SIZE (new_mode), GET_MODE_SIZE (tmode)) > + && known_eq (bitsize, GET_MODE_UNIT_PRECISION (new_mode)) > + && multiple_p (bitnum, GET_MODE_UNIT_PRECISION (new_mode)) > && targetm.vector_mode_supported_p (new_mode) > && targetm.modes_tieable_p (GET_MODE (op0), new_mode)) > break; > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c > new file mode 100644 > index 00000000000..c049c5a0386 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c > @@ -0,0 +1,17 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax -O3 -fno-vect-cost-model -fno-tree-loop-distribute-patterns" } */ > + > +int a[1024]; > +int b[1024]; > + > +_Bool > +fn1 () > +{ > + _Bool tem; > + for (int i = 0; i < 1024; ++i) > + { > + tem = !a[i]; > + b[i] = tem; > + } > + return tem; > +}
diff --git a/gcc/expmed.cc b/gcc/expmed.cc index ed17850ff74..6fbe4d9cfaf 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -1722,10 +1722,23 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, } } + /* The target may prefer to return the extracted value in a different mode + than INNERMODE. */ + machine_mode outermode = GET_MODE (op0); + machine_mode element_mode = GET_MODE_INNER (outermode); + if (VECTOR_MODE_P (outermode) && !MEM_P (op0)) + { + enum insn_code icode + = convert_optab_handler (vec_extract_optab, outermode, element_mode); + + if (icode != CODE_FOR_nothing) + element_mode = insn_data[icode].operand[0].mode; + } + /* See if we can get a better vector mode before extracting. */ if (VECTOR_MODE_P (GET_MODE (op0)) && !MEM_P (op0) - && GET_MODE_INNER (GET_MODE (op0)) != tmode) + && element_mode != tmode) { machine_mode new_mode; @@ -1755,7 +1768,7 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum, /* Use vec_extract patterns for extracting parts of vectors whenever available. If that fails, see whether the current modes and bitregion give a natural subreg. */ - machine_mode outermode = GET_MODE (op0); + outermode = GET_MODE (op0); if (VECTOR_MODE_P (outermode) && !MEM_P (op0)) { scalar_mode innermode = GET_MODE_INNER (outermode); diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c new file mode 100644 index 00000000000..c049c5a0386 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112999.c @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv_zvl512b -mabi=lp64d --param=riscv-autovec-lmul=m8 --param=riscv-autovec-preference=fixed-vlmax -O3 -fno-vect-cost-model -fno-tree-loop-distribute-patterns" } */ + +int a[1024]; +int b[1024]; + +_Bool +fn1 () +{ + _Bool tem; + for (int i = 0; i < 1024; ++i) + { + tem = !a[i]; + b[i] = tem; + } + return tem; +}