Message ID | 20240906113207.1095-1-jinma@linux.alibaba.com |
---|---|
State | New |
Headers | show |
Series | [v2] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32. | expand |
On 9/6/24 5:32 AM, Jin Ma wrote: > In the process of DF to SI, we generally use "unsigned_fix" rather than > "truncate" for conversion. Although this has no effect in general, > unexpected ICE often occurs when precise semantic analysis is required. > > gcc/ChangeLog: > > * config/riscv/riscv.md: Change "truncate" to "unsigned_fix" for > the Zfa extension on rv32. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/zfa-fmovh-fmovp-bug.c: New test. This doesn't look correct. fmv.x.w just moves the bits from one place to another, no conversion. So I don't see how the original pattern was correct. Using truncate on an FP mode source isn't defined. But I also don't think the updated pattern is correct either. jeff
On Sat, Sep 7, 2024 at 7:08 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 9/6/24 5:32 AM, Jin Ma wrote: > > In the process of DF to SI, we generally use "unsigned_fix" rather than > > "truncate" for conversion. Although this has no effect in general, > > unexpected ICE often occurs when precise semantic analysis is required. > > > > gcc/ChangeLog: > > > > * config/riscv/riscv.md: Change "truncate" to "unsigned_fix" for > > the Zfa extension on rv32. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/zfa-fmovh-fmovp-bug.c: New test. > This doesn't look correct. > > fmv.x.w just moves the bits from one place to another, no conversion. > > So I don't see how the original pattern was correct. Using truncate on > an FP mode source isn't defined. But I also don't think the updated > pattern is correct either. > jeff Having matching pattern for these Zfa moves seems pointless because the optimization that utilizes the instructions in riscv_split_doubleword_move() uses: gen_movdfsisi3_rv32(), gen_movsidf2_low_rv32() and gen_movsidf2_high_rv32(). In the XTheadFmv pattern, we use unspec, so the pattern won't match. I think this should be done for Zfa as well.
> Having matching pattern for these Zfa moves seems pointless because > the optimization that utilizes the instructions in > riscv_split_doubleword_move() uses: > gen_movdfsisi3_rv32(), gen_movsidf2_low_rv32() and gen_movsidf2_high_rv32(). > In the XTheadFmv pattern, we use unspec, so the pattern won't match. > I think this should be done for Zfa as well. Yes, that's a very good suggestion. I will try. Thanks. Jin
On 9/8/24 3:28 PM, Christoph Müllner wrote: > On Sat, Sep 7, 2024 at 7:08 PM Jeff Law <jeffreyalaw@gmail.com> wrote: >> >> >> >> On 9/6/24 5:32 AM, Jin Ma wrote: >>> In the process of DF to SI, we generally use "unsigned_fix" rather than >>> "truncate" for conversion. Although this has no effect in general, >>> unexpected ICE often occurs when precise semantic analysis is required. >>> >>> gcc/ChangeLog: >>> >>> * config/riscv/riscv.md: Change "truncate" to "unsigned_fix" for >>> the Zfa extension on rv32. >>> >>> gcc/testsuite/ChangeLog: >>> >>> * gcc.target/riscv/zfa-fmovh-fmovp-bug.c: New test. >> This doesn't look correct. >> >> fmv.x.w just moves the bits from one place to another, no conversion. >> >> So I don't see how the original pattern was correct. Using truncate on >> an FP mode source isn't defined. But I also don't think the updated >> pattern is correct either. >> jeff > > Having matching pattern for these Zfa moves seems pointless because > the optimization that utilizes the instructions in > riscv_split_doubleword_move() uses: > gen_movdfsisi3_rv32(), gen_movsidf2_low_rv32() and gen_movsidf2_high_rv32(). > In the XTheadFmv pattern, we use unspec, so the pattern won't match. > I think this should be done for Zfa as well. But if the generated code is just moving bits, why can't we use the standard movXX patterns for the data movement? Clearly there's something about this that I'm missing. Jeff
On Tue, Sep 10, 2024 at 5:25 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 9/8/24 3:28 PM, Christoph Müllner wrote: > > On Sat, Sep 7, 2024 at 7:08 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > >> > >> > >> > >> On 9/6/24 5:32 AM, Jin Ma wrote: > >>> In the process of DF to SI, we generally use "unsigned_fix" rather than > >>> "truncate" for conversion. Although this has no effect in general, > >>> unexpected ICE often occurs when precise semantic analysis is required. > >>> > >>> gcc/ChangeLog: > >>> > >>> * config/riscv/riscv.md: Change "truncate" to "unsigned_fix" for > >>> the Zfa extension on rv32. > >>> > >>> gcc/testsuite/ChangeLog: > >>> > >>> * gcc.target/riscv/zfa-fmovh-fmovp-bug.c: New test. > >> This doesn't look correct. > >> > >> fmv.x.w just moves the bits from one place to another, no conversion. > >> > >> So I don't see how the original pattern was correct. Using truncate on > >> an FP mode source isn't defined. But I also don't think the updated > >> pattern is correct either. > >> jeff > > > > Having matching pattern for these Zfa moves seems pointless because > > the optimization that utilizes the instructions in > > riscv_split_doubleword_move() uses: > > gen_movdfsisi3_rv32(), gen_movsidf2_low_rv32() and gen_movsidf2_high_rv32(). > > In the XTheadFmv pattern, we use unspec, so the pattern won't match. > > I think this should be done for Zfa as well. > But if the generated code is just moving bits, why can't we use the > standard movXX patterns for the data movement? Clearly there's > something about this that I'm missing. This is a special case for rv32 + D, where we have SI, SF and DF, but no DI. This raises the question of how to model a 2x 32-bit register <-> DF register transfer. This is currently done using movdf_hardfloat_rv32, which could either go through memory (no Zfa), or create an "artificial" 64-bit move using the constraint "zmvr", which will later be picked up by the doubleword-move splitter, which converts the move into a Zfa move via riscv_split_doubleword_move().
On 9/10/24 10:09 AM, Christoph Müllner wrote: >> But if the generated code is just moving bits, why can't we use the >> standard movXX patterns for the data movement? Clearly there's >> something about this that I'm missing. > > This is a special case for rv32 + D, where we have SI, SF and DF, > but no DI. This raises the question of how to model a 2x 32-bit > register <-> DF register transfer. This is currently done using > movdf_hardfloat_rv32, which could either go through memory (no Zfa), > or create an "artificial" 64-bit move using the constraint "zmvr", > which will later be picked up by the doubleword-move splitter, which > converts the move into a Zfa move via riscv_split_doubleword_move(). My point is we shouldn't need a special pattern if we're just doing data movement. A special pattern indicates either a misunderstanding of how this should work or some behind the scenes semantics of the instruction we're using. Normally for the case where you have 32bit GPRs but 64bit FPRs, a move from an FPR<->GPR move will likely go through memory as there often isn't a good way to read/write the halves of the FPR. Which often means defining secondary memory reloads. It's painful, but necessary. For this to go directly from FPR<->GPR there would have to be instructions which either access those halves of the FPR independently or instructions that read/write register pairs on the GPR side. So I think there's some semantic thing I must be missing WRT Zfa on rv32. Does Zfa on rv32 provide one of the two mechanisms above? jeff
On Wed, Sep 18, 2024 at 3:55 PM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 9/10/24 10:09 AM, Christoph Müllner wrote: > > >> But if the generated code is just moving bits, why can't we use the > >> standard movXX patterns for the data movement? Clearly there's > >> something about this that I'm missing. > > > This is a special case for rv32 + D, where we have SI, SF and DF, > > but no DI. This raises the question of how to model a 2x 32-bit > > register <-> DF register transfer. This is currently done using > > movdf_hardfloat_rv32, which could either go through memory (no Zfa), > > or create an "artificial" 64-bit move using the constraint "zmvr", > > which will later be picked up by the doubleword-move splitter, which > > converts the move into a Zfa move via riscv_split_doubleword_move(). > My point is we shouldn't need a special pattern if we're just doing data > movement. A special pattern indicates either a misunderstanding of how > this should work or some behind the scenes semantics of the instruction > we're using. > > Normally for the case where you have 32bit GPRs but 64bit FPRs, a move > from an FPR<->GPR move will likely go through memory as there often > isn't a good way to read/write the halves of the FPR. Which often means > defining secondary memory reloads. It's painful, but necessary. > > For this to go directly from FPR<->GPR there would have to be > instructions which either access those halves of the FPR independently > or instructions that read/write register pairs on the GPR side. > > So I think there's some semantic thing I must be missing WRT Zfa on > rv32. Does Zfa on rv32 provide one of the two mechanisms above? Yes: * fmvh.x.d moves bits 63:32 of an FPR into a 32-bit GPR * fmvp.d.x moves two 32-bit GPR registers into a 64-bit FPR
On 9/18/24 8:03 AM, Christoph Müllner wrote: > On Wed, Sep 18, 2024 at 3:55 PM Jeff Law <jeffreyalaw@gmail.com> wrote: >> >> >> >> On 9/10/24 10:09 AM, Christoph Müllner wrote: >> >>>> But if the generated code is just moving bits, why can't we use the >>>> standard movXX patterns for the data movement? Clearly there's >>>> something about this that I'm missing. >>>> This is a special case for rv32 + D, where we have SI, SF and DF, >>> but no DI. This raises the question of how to model a 2x 32-bit >>> register <-> DF register transfer. This is currently done using >>> movdf_hardfloat_rv32, which could either go through memory (no Zfa), >>> or create an "artificial" 64-bit move using the constraint "zmvr", >>> which will later be picked up by the doubleword-move splitter, which >>> converts the move into a Zfa move via riscv_split_doubleword_move(). >> My point is we shouldn't need a special pattern if we're just doing data >> movement. A special pattern indicates either a misunderstanding of how >> this should work or some behind the scenes semantics of the instruction >> we're using. >> >> Normally for the case where you have 32bit GPRs but 64bit FPRs, a move >> from an FPR<->GPR move will likely go through memory as there often >> isn't a good way to read/write the halves of the FPR. Which often means >> defining secondary memory reloads. It's painful, but necessary. >> >> For this to go directly from FPR<->GPR there would have to be >> instructions which either access those halves of the FPR independently >> or instructions that read/write register pairs on the GPR side. >> >> So I think there's some semantic thing I must be missing WRT Zfa on >> rv32. Does Zfa on rv32 provide one of the two mechanisms above? > > Yes: > * fmvh.x.d moves bits 63:32 of an FPR into a 32-bit GPR > * fmvp.d.x moves two 32-bit GPR registers into a 64-bit FPR Thanks for clarifying. If that's what we're using, then yea, hiding behind an unspec is reasonable. In theory we could try to improve things and expose all the underlying semantics. fmvh.x.d is a subreg high part access to the fpr. fmvp.d.x is just a standard 64bit movdi/movdf move, those exposing it as such may have some fallout. BUt I don't think either of those are necessary to tackle right now. So, no objection to the patch from me anymore. Let's go with the v2. Jeff
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 9f94b5aa0232..36d7b333c456 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -2627,7 +2627,7 @@ (define_insn "*movdf_softfloat" (define_insn "movsidf2_low_rv32" [(set (match_operand:SI 0 "register_operand" "= r") - (truncate:SI + (unsigned_fix:SI (match_operand:DF 1 "register_operand" "zmvf")))] "TARGET_HARD_FLOAT && !TARGET_64BIT && TARGET_ZFA" "fmv.x.w\t%0,%1" @@ -2638,7 +2638,7 @@ (define_insn "movsidf2_low_rv32" (define_insn "movsidf2_high_rv32" [(set (match_operand:SI 0 "register_operand" "= r") - (truncate:SI + (unsigned_fix:SI (lshiftrt:DF (match_operand:DF 1 "register_operand" "zmvf") (const_int 32))))] diff --git a/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-bug.c b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-bug.c new file mode 100644 index 000000000000..e00047b09e3a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zfa-fmovh-fmovp-bug.c @@ -0,0 +1,9 @@ +/* Test that we do not have ice when compile */ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zfa -mabi=ilp32d -O2 -g" } */ + +unsigned int +foo (double a) { + unsigned int tt = *(unsigned long long *)&a & 0xffff; + return tt; +}