[v4,02/12] tcg/riscv: Add basic support for vector

Message ID 20240911132630.461-3-zhiwei_liu@linux.alibaba.com
State New
Series tcg/riscv: Add support for vector

Commit Message

LIU Zhiwei Sept. 11, 2024, 1:26 p.m. UTC
From: Swung0x48 <swung0x48@outlook.com>

The RISC-V vector instruction set utilizes the LMUL field to group
multiple registers, enabling variable-length vector registers. This
implementation uses only the first register number of each group while
reserving the other register numbers within the group.

In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
host runtime needs to adjust LMUL based on the type to use different
register groups.

This presents challenges for TCG's register allocation. Currently, we
avoid modifying the register allocation part of TCG and only expose the
minimum number of vector registers.

For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
LMUL equal to 4, we use 4 vector registers as one register group. We can
use a maximum of 8 register groups, but the V0 register number is reserved
as a mask register, so we can effectively use at most 7 register groups.
Moreover, when the type is smaller than TCG_TYPE_V256, we are still
limited to those 7 registers, because TCG cannot yet constrain registers
dynamically by type. Likewise, when the host vlen is 128 bits and the type
is TCG_TYPE_V256, we can use at most 15 registers.

There is not much pressure on vector register allocation in TCG now, so
using 7 registers is feasible and will not have a major impact on code
generation.

This patch:
1. Reserves vector register 0 for use as a mask register.
2. When using register groups, reserves the additional registers within
   each group.

Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Co-authored-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
---
 tcg/riscv/tcg-target-con-str.h |   1 +
 tcg/riscv/tcg-target.c.inc     | 126 ++++++++++++++++++++++++---------
 tcg/riscv/tcg-target.h         |  78 +++++++++++---------
 tcg/riscv/tcg-target.opc.h     |  12 ++++
 4 files changed, 151 insertions(+), 66 deletions(-)
 create mode 100644 tcg/riscv/tcg-target.opc.h
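
For readers following the register-group arithmetic in the commit message,
here is a minimal standalone C sketch (not code from the patch) of how the
allocatable vector registers can be derived. It assumes, as the patch's
ALL_VECTOR_REGS mask does, that v0..v31 occupy bits 32..63 of a 64-bit
register set. The helper names group_mask() and usable_regs() are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NB_GENERAL_REGS 32  /* x0..x31 occupy bits 0..31 of the set */
#define NB_VECTOR_REGS  32  /* v0..v31 occupy bits 32..63 of the set */

/*
 * Hypothetical helper: build a mask with one bit set for the first
 * register of every LMUL-sized group (v0, v0+lmul, v0+2*lmul, ...).
 */
static uint64_t group_mask(unsigned lmul)
{
    uint64_t mask = 0;

    /* RISC-V only defines integer LMUL values of 1, 2, 4 and 8. */
    assert(lmul == 1 || lmul == 2 || lmul == 4 || lmul == 8);

    for (unsigned v = 0; v < NB_VECTOR_REGS; v += lmul) {
        mask |= UINT64_C(1) << (NB_GENERAL_REGS + v);
    }
    return mask;
}

/*
 * Hypothetical helper: count the group leaders left for allocation
 * once v0 has been reserved as the mask register.
 */
static unsigned usable_regs(unsigned lmul)
{
    uint64_t mask = group_mask(lmul);
    unsigned count = 0;

    mask &= ~(UINT64_C(1) << NB_GENERAL_REGS);  /* drop v0 */
    while (mask) {
        mask &= mask - 1;  /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

Under these assumptions, group_mask(2) and group_mask(4) evaluate to
0x5555555500000000 and 0x1111111100000000, the values the patch uses for
ALL_DVECTOR_REG_GROUPS and ALL_QVECTOR_REG_GROUPS, and usable_regs(4) and
usable_regs(2) give the 7 and 15 registers quoted in the commit message.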

Comments

Richard Henderson Sept. 11, 2024, 6:41 p.m. UTC | #1
On 9/11/24 06:26, LIU Zhiwei wrote:
> From: Swung0x48<swung0x48@outlook.com>
> 
> The RISC-V vector instruction set utilizes the LMUL field to group
> multiple registers, enabling variable-length vector registers. This
> implementation uses only the first register number of each group while
> reserving the other register numbers within the group.
> 
> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
> host runtime needs to adjust LMUL based on the type to use different
> register groups.
> 
> This presents challenges for TCG's register allocation. Currently, we
> avoid modifying the register allocation part of TCG and only expose the
> minimum number of vector registers.
> 
> For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
> LMUL equal to 4, we use 4 vector registers as one register group. We can
> use a maximum of 8 register groups, but the V0 register number is reserved
> as a mask register, so we can effectively use at most 7 register groups.
> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
> forced to be used. This is because TCG cannot yet dynamically constrain
> registers with type; likewise, when the host vlen is 128 bits and
> TCG_TYPE_V256, we can use at most 15 registers.
> 
> There is not much pressure on vector register allocation in TCG now, so
> using 7 registers is feasible and will not have a major impact on code
> generation.
> 
> This patch:
> 1. Reserves vector register 0 for use as a mask register.
> 2. When using register groups, reserves the additional registers within
>     each group.
> 
> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>

If there is a co-author, there should be another Signed-off-by.

> Reviewed-by: Liu Zhiwei<zhiwei_liu@linux.alibaba.com>
> ---
>   tcg/riscv/tcg-target-con-str.h |   1 +
>   tcg/riscv/tcg-target.c.inc     | 126 ++++++++++++++++++++++++---------
>   tcg/riscv/tcg-target.h         |  78 +++++++++++---------
>   tcg/riscv/tcg-target.opc.h     |  12 ++++
>   4 files changed, 151 insertions(+), 66 deletions(-)
>   create mode 100644 tcg/riscv/tcg-target.opc.h

Anyway,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>


r~
LIU Zhiwei Sept. 18, 2024, 5:17 a.m. UTC | #2
On 2024/9/12 2:41, Richard Henderson wrote:
> On 9/11/24 06:26, LIU Zhiwei wrote:
>> From: Swung0x48<swung0x48@outlook.com>
>>
>> The RISC-V vector instruction set utilizes the LMUL field to group
>> multiple registers, enabling variable-length vector registers. This
>> implementation uses only the first register number of each group while
>> reserving the other register numbers within the group.
>>
>> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
>> host runtime needs to adjust LMUL based on the type to use different
>> register groups.
>>
>> This presents challenges for TCG's register allocation. Currently, we
>> avoid modifying the register allocation part of TCG and only expose the
>> minimum number of vector registers.
>>
>> For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, 
>> with
>> LMUL equal to 4, we use 4 vector registers as one register group. We can
>> use a maximum of 8 register groups, but the V0 register number is 
>> reserved
>> as a mask register, so we can effectively use at most 7 register groups.
>> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
>> forced to be used. This is because TCG cannot yet dynamically constrain
>> registers with type; likewise, when the host vlen is 128 bits and
>> TCG_TYPE_V256, we can use at most 15 registers.
>>
>> There is not much pressure on vector register allocation in TCG now, so
>> using 7 registers is feasible and will not have a major impact on code
>> generation.
>>
>> This patch:
>> 1. Reserves vector register 0 for use as a mask register.
>> 2. When using register groups, reserves the additional registers within
>>     each group.
>>
>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>
> If there is a co-author, there should be another Signed-off-by.

This patch has added a tag:

Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>


Do you mean we should add the same tag twice?

Thanks,
Zhiwei

>
>> Reviewed-by: Liu Zhiwei<zhiwei_liu@linux.alibaba.com>
>> ---
>>   tcg/riscv/tcg-target-con-str.h |   1 +
>>   tcg/riscv/tcg-target.c.inc     | 126 ++++++++++++++++++++++++---------
>>   tcg/riscv/tcg-target.h         |  78 +++++++++++---------
>>   tcg/riscv/tcg-target.opc.h     |  12 ++++
>>   4 files changed, 151 insertions(+), 66 deletions(-)
>>   create mode 100644 tcg/riscv/tcg-target.opc.h
>
> Anyway,
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
>
>
> r~
Richard Henderson Sept. 18, 2024, 10:11 a.m. UTC | #3
On 9/18/24 07:17, LIU Zhiwei wrote:
> 
> On 2024/9/12 2:41, Richard Henderson wrote:
>> On 9/11/24 06:26, LIU Zhiwei wrote:
>>> From: Swung0x48<swung0x48@outlook.com>
>>>
>>> The RISC-V vector instruction set utilizes the LMUL field to group
>>> multiple registers, enabling variable-length vector registers. This
>>> implementation uses only the first register number of each group while
>>> reserving the other register numbers within the group.
>>>
>>> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
>>> host runtime needs to adjust LMUL based on the type to use different
>>> register groups.
>>>
>>> This presents challenges for TCG's register allocation. Currently, we
>>> avoid modifying the register allocation part of TCG and only expose the
>>> minimum number of vector registers.
>>>
>>> For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
>>> LMUL equal to 4, we use 4 vector registers as one register group. We can
>>> use a maximum of 8 register groups, but the V0 register number is reserved
>>> as a mask register, so we can effectively use at most 7 register groups.
>>> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
>>> forced to be used. This is because TCG cannot yet dynamically constrain
>>> registers with type; likewise, when the host vlen is 128 bits and
>>> TCG_TYPE_V256, we can use at most 15 registers.
>>>
>>> There is not much pressure on vector register allocation in TCG now, so
>>> using 7 registers is feasible and will not have a major impact on code
>>> generation.
>>>
>>> This patch:
>>> 1. Reserves vector register 0 for use as a mask register.
>>> 2. When using register groups, reserves the additional registers within
>>>     each group.
>>>
>>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>
>> If there is a co-author, there should be another Signed-off-by.
> 
> This patch has added a tag:
> 
> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> 
> 
> Do you mean we should add the same tag twice?

The from line is "Swung0x48 <swung0x48@outlook.com>".
If this is an alternate email for TANG Tiancheng, then please fix the patch --author.


r~
LIU Zhiwei Sept. 18, 2024, 10:43 a.m. UTC | #4
On 2024/9/18 18:11, Richard Henderson wrote:
> On 9/18/24 07:17, LIU Zhiwei wrote:
>>
>> On 2024/9/12 2:41, Richard Henderson wrote:
>>> On 9/11/24 06:26, LIU Zhiwei wrote:
>>>> From: Swung0x48<swung0x48@outlook.com>
>>>>
>>>> The RISC-V vector instruction set utilizes the LMUL field to group
>>>> multiple registers, enabling variable-length vector registers. This
>>>> implementation uses only the first register number of each group while
>>>> reserving the other register numbers within the group.
>>>>
>>>> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
>>>> host runtime needs to adjust LMUL based on the type to use different
>>>> register groups.
>>>>
>>>> This presents challenges for TCG's register allocation. Currently, we
>>>> avoid modifying the register allocation part of TCG and only expose 
>>>> the
>>>> minimum number of vector registers.
>>>>
>>>> For example, when the host vlen is 64 bits and type is 
>>>> TCG_TYPE_V256, with
>>>> LMUL equal to 4, we use 4 vector registers as one register group. 
>>>> We can
>>>> use a maximum of 8 register groups, but the V0 register number is 
>>>> reserved
>>>> as a mask register, so we can effectively use at most 7 register 
>>>> groups.
>>>> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers 
>>>> are
>>>> forced to be used. This is because TCG cannot yet dynamically 
>>>> constrain
>>>> registers with type; likewise, when the host vlen is 128 bits and
>>>> TCG_TYPE_V256, we can use at most 15 registers.
>>>>
>>>> There is not much pressure on vector register allocation in TCG 
>>>> now, so
>>>> using 7 registers is feasible and will not have a major impact on code
>>>> generation.
>>>>
>>>> This patch:
>>>> 1. Reserves vector register 0 for use as a mask register.
>>>> 2. When using register groups, reserves the additional registers 
>>>> within
>>>>     each group.
>>>>
>>>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>>> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>>
>>> If there is a co-author, there should be another Signed-off-by.
>>
>> This patch has added a tag:
>>
>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>
>>
>> Do you mean we should add the same tag twice?
>
> The from line is "Swung0x48 <swung0x48@outlook.com>".
> If this is an alternate email for TANG Tiancheng,

No, Swung0x48 is another author.

Thanks,
Zhiwei

> then please fix the patch --author.
>
>
> r~
Richard Henderson Sept. 18, 2024, 2:27 p.m. UTC | #5
On 9/18/24 12:43, LIU Zhiwei wrote:
> 
> On 2024/9/18 18:11, Richard Henderson wrote:
>> On 9/18/24 07:17, LIU Zhiwei wrote:
>>>
>>> On 2024/9/12 2:41, Richard Henderson wrote:
>>>> On 9/11/24 06:26, LIU Zhiwei wrote:
>>>>> From: Swung0x48<swung0x48@outlook.com>
>>>>>
>>>>> The RISC-V vector instruction set utilizes the LMUL field to group
>>>>> multiple registers, enabling variable-length vector registers. This
>>>>> implementation uses only the first register number of each group while
>>>>> reserving the other register numbers within the group.
>>>>>
>>>>> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
>>>>> host runtime needs to adjust LMUL based on the type to use different
>>>>> register groups.
>>>>>
>>>>> This presents challenges for TCG's register allocation. Currently, we
>>>>> avoid modifying the register allocation part of TCG and only expose the
>>>>> minimum number of vector registers.
>>>>>
>>>>> For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
>>>>> LMUL equal to 4, we use 4 vector registers as one register group. We can
>>>>> use a maximum of 8 register groups, but the V0 register number is reserved
>>>>> as a mask register, so we can effectively use at most 7 register groups.
>>>>> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
>>>>> forced to be used. This is because TCG cannot yet dynamically constrain
>>>>> registers with type; likewise, when the host vlen is 128 bits and
>>>>> TCG_TYPE_V256, we can use at most 15 registers.
>>>>>
>>>>> There is not much pressure on vector register allocation in TCG now, so
>>>>> using 7 registers is feasible and will not have a major impact on code
>>>>> generation.
>>>>>
>>>>> This patch:
>>>>> 1. Reserves vector register 0 for use as a mask register.
>>>>> 2. When using register groups, reserves the additional registers within
>>>>>     each group.
>>>>>
>>>>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>>>> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>>>
>>>> If there is a co-author, there should be another Signed-off-by.
>>>
>>> This patch has added a tag:
>>>
>>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
>>>
>>>
>>> Do you mean we should add the same tag twice?
>>
>> The from line is "Swung0x48 <swung0x48@outlook.com>".
>> If this is an alternate email for TANG Tiancheng,
> 
> No, Swung0x48 is another author.

Then we need a proper Signed-off-by line from that author.


r~
0x48 Swung Sept. 20, 2024, 4:01 a.m. UTC | #6
Hey everyone! Late to the party. Life happens sometimes ;)
I just discovered this patch and this mailing list, and I'd like to provide some background here.
I originally provided my initial implementation in a downstream repo last year, namely https://github.com/plctlab/plct-qemu/tree/plct-riscv-backend-rvv .
I'm new to contributing to QEMU and to the open-source upstreaming process as a whole, so I may make mistakes in the following claims, but I see some confusion here:
1. The PLCT branch (which includes my original commits) is open-sourced under GPLv2, following QEMU's upstream repo. So according to the license, my modifications should be EXPLICITLY shown in the patch, but I haven't seen any.
2. I did consent last year to upstreaming my patch, in the form of a patch submitted with modifications from T-head, and on their behalf. And it was agreed back then that I would be mentioned as one of the authors. But it turns out that there's no "sign-off", "author", or "co-author" line mentioning me. If I don't speak out in this situation, does it imply that this patch is purely LIU Zhiwei's work and has nothing to do with me?

I'd like LIU to split my patch and his modifications into two separate patches, and to state explicitly where each patch comes from, so that this patch complies with the GPLv2 license and we can clear up these misunderstandings.

I don't want to take it personally, but I do smell something wrong going on here...

Best Regards,
Swung0x48 (aka. Huang Shiyuan)

LIU Zhiwei Sept. 20, 2024, 4:27 a.m. UTC | #7
On 2024/9/20 12:01, 0x48 Swung wrote:
> Hey everyone! Late to the party. Life happens sometimes ;)
> Just discovered this patch and this mail list, and I'd like to provide 
> some background story here.
> I originally provided my initial implementation in a downstream repo 
> last year, namely 
> https://github.com/plctlab/plct-qemu/tree/plct-riscv-backend-rvv .
> I'm new to contributing to qemu and also take part in the open-source 
> community upstreaming process as a whole, so I may make mistakes in my 
> following claims, but I see some confusion here:
> 1. The PLCT branch (which includes my original commits) is 
> open-sourced using GPLv2, which follows QEMU's upstream repo. So 
> according to the license, my modification should be EXPLICITLY shown 
> in the patch, but I haven't seen any.
I think I have handled this carefully.
> 2. I do consent upstreaming my patch last year, in the form of a patch 
> submitted with modifications from T-head, and on behalf of them. And 
> it was agreed back in the days that I can be mentioned as one of the 
> authors. But it turns out that there's no "sign-off", "author", 
> "co-author" line mentioning me.

You are the author of this patch. You can see this from the
"From: Swung0x48 <swung0x48@outlook.com>" line in the patch.

In v4, TianCheng felt that he had also contributed to this patch, so he
added himself as a co-author.

> If I don't speak out in this situation, does it imply that this patch 
> is purely LIU Zhiwei's work and have nothing to do with me?
No. I only reviewed this patch set and sent it to the mailing list. None
of this patch set belongs to me.
>
> I'd like LIU to separate my patch and his modification to two separate 
> patches, and explicitly name where are those patches coming from, so 
> that this patch can comply to GPLv2 license and can we clarify those 
> misunderstandings.

I think we have done that. Please point out any mistakes you find.

Thanks,
Zhiwei

>
> I don't want to take it personally , but I do smell something's wrong 
> going on here...
>
> Best Regards,
> Swung0x48 (aka. Huang Shiyuan)
>
> ------------------------------------------------------------------------
> *From:* Richard Henderson <richard.henderson@linaro.org>
> *Sent:* Wednesday, September 18, 2024 10:27:16 PM
> *To:* LIU Zhiwei <zhiwei_liu@linux.alibaba.com>; qemu-devel@nongnu.org 
> <qemu-devel@nongnu.org>
> *Cc:* qemu-riscv@nongnu.org <qemu-riscv@nongnu.org>; 
> palmer@dabbelt.com <palmer@dabbelt.com>; alistair.francis@wdc.com 
> <alistair.francis@wdc.com>; dbarboza@ventanamicro.com 
> <dbarboza@ventanamicro.com>; liwei1518@gmail.com 
> <liwei1518@gmail.com>; bmeng.cn@gmail.com <bmeng.cn@gmail.com>; 
> Swung0x48 <swung0x48@outlook.com>; TANG Tiancheng 
> <tangtiancheng.ttc@alibaba-inc.com>
> *Subject:* Re: [PATCH v4 02/12] tcg/riscv: Add basic support for vector
> On 9/18/24 12:43, LIU Zhiwei wrote:
> >
> > On 2024/9/18 18:11, Richard Henderson wrote:
> >> On 9/18/24 07:17, LIU Zhiwei wrote:
> >>>
> >>> On 2024/9/12 2:41, Richard Henderson wrote:
> >>>> On 9/11/24 06:26, LIU Zhiwei wrote:
> >>>>> From: Swung0x48<swung0x48@outlook.com>
> >>>>>
> >>>>> The RISC-V vector instruction set utilizes the LMUL field to group
> >>>>> multiple registers, enabling variable-length vector registers. This
> >>>>> implementation uses only the first register number of each group 
> while
> >>>>> reserving the other register numbers within the group.
> >>>>>
> >>>>> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
> >>>>> host runtime needs to adjust LMUL based on the type to use different
> >>>>> register groups.
> >>>>>
> >>>>> This presents challenges for TCG's register allocation. 
> Currently, we
> >>>>> avoid modifying the register allocation part of TCG and only 
> expose the
> >>>>> minimum number of vector registers.
> >>>>>
> >>>>> For example, when the host vlen is 64 bits and type is 
> TCG_TYPE_V256, with
> >>>>> LMUL equal to 4, we use 4 vector registers as one register 
> group. We can
> >>>>> use a maximum of 8 register groups, but the V0 register number 
> is reserved
> >>>>> as a mask register, so we can effectively use at most 7 register 
> groups.
> >>>>> Moreover, when type is smaller than TCG_TYPE_V256, only 7 
> registers are
> >>>>> forced to be used. This is because TCG cannot yet dynamically 
> constrain
> >>>>> registers with type; likewise, when the host vlen is 128 bits and
> >>>>> TCG_TYPE_V256, we can use at most 15 registers.
> >>>>>
> >>>>> There is not much pressure on vector register allocation in TCG 
> now, so
> >>>>> using 7 registers is feasible and will not have a major impact 
> on code
> >>>>> generation.
> >>>>>
> >>>>> This patch:
> >>>>> 1. Reserves vector register 0 for use as a mask register.
> >>>>> 2. When using register groups, reserves the additional registers 
> within
> >>>>>     each group.
> >>>>>
> >>>>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> >>>>> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> >>>>
> >>>> If there is a co-author, there should be another Signed-off-by.
> >>>
> >>> This patch has added a tag:
> >>>
> >>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> >>>
> >>>
> >>> Do you mean we should add the same tag twice?
> >>
> >> The from line is "Swung0x48 <swung0x48@outlook.com>".
> >> If this is an alternate email for TANG Tiancheng,
> >
> > No, Swung0x48 is another author.
>
> Then we need a proper Signed-off-by line from that author.
>
>
> r~
Daniel Henrique Barboza Sept. 20, 2024, 11:26 a.m. UTC | #8
Hi Zhiwei,

On 9/11/24 10:26 AM, LIU Zhiwei wrote:
> From: Swung0x48 <swung0x48@outlook.com>
> 
> The RISC-V vector instruction set utilizes the LMUL field to group
> multiple registers, enabling variable-length vector registers. This
> implementation uses only the first register number of each group while
> reserving the other register numbers within the group.
> 
> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
> host runtime needs to adjust LMUL based on the type to use different
> register groups.
> 
> This presents challenges for TCG's register allocation. Currently, we
> avoid modifying the register allocation part of TCG and only expose the
> minimum number of vector registers.
> 
> For example, when the host vlen is 64 bits and type is TCG_TYPE_V256, with
> LMUL equal to 4, we use 4 vector registers as one register group. We can
> use a maximum of 8 register groups, but the V0 register number is reserved
> as a mask register, so we can effectively use at most 7 register groups.
> Moreover, when type is smaller than TCG_TYPE_V256, only 7 registers are
> forced to be used. This is because TCG cannot yet dynamically constrain
> registers with type; likewise, when the host vlen is 128 bits and
> TCG_TYPE_V256, we can use at most 15 registers.
> 
> There is not much pressure on vector register allocation in TCG now, so
> using 7 registers is feasible and will not have a major impact on code
> generation.
> 
> This patch:
> 1. Reserves vector register 0 for use as a mask register.
> 2. When using register groups, reserves the additional registers within
>     each group.
> 
> Signed-off-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
> Co-authored-by: TANG Tiancheng <tangtiancheng.ttc@alibaba-inc.com>
> Reviewed-by: Liu Zhiwei <zhiwei_liu@linux.alibaba.com>
> ---


As Richard already pointed out, we must have a "Signed-off-by" tag from the author of
the patch, and it must use the exact same spelling as the "From:" line. So in this case:

Signed-off-by: Swung0x48 <swung0x48@outlook.com>


More info here:

https://www.qemu.org/docs/master/devel/submitting-a-patch.html

-----

Your patches must include a Signed-off-by: line. This is a hard requirement
because it’s how you say “I’m legally okay to contribute this and happy for it
to go into QEMU”. The process is modelled after the Linux kernel policy.

If you wrote the patch, make sure your "From:" and "Signed-off-by:"
lines use the same spelling. It's okay if you subscribe or contribute to
the list via more than one address, but using multiple addresses in one
commit just confuses things. If someone else wrote the patch, git will
include a "From:" line in the body of the email (different from your
envelope From:) that will give credit to the correct author; but again,
that author's Signed-off-by: line is mandatory, with the same spelling.

-----

However, you can't simply add this tag to the patch yourself, since you're not Swung0x48.
We need Swung0x48 to reply here with an ack indicating that it is OK to add the
Signed-off-by as required, as an indication that Swung0x48 accepts the legal
implications of doing so.


Thanks,

Daniel


>   tcg/riscv/tcg-target-con-str.h |   1 +
>   tcg/riscv/tcg-target.c.inc     | 126 ++++++++++++++++++++++++---------
>   tcg/riscv/tcg-target.h         |  78 +++++++++++---------
>   tcg/riscv/tcg-target.opc.h     |  12 ++++
>   4 files changed, 151 insertions(+), 66 deletions(-)
>   create mode 100644 tcg/riscv/tcg-target.opc.h
> 
> diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
> index d5c419dff1..b2b3211bcb 100644
> --- a/tcg/riscv/tcg-target-con-str.h
> +++ b/tcg/riscv/tcg-target-con-str.h
> @@ -9,6 +9,7 @@
>    * REGS(letter, register_mask)
>    */
>   REGS('r', ALL_GENERAL_REGS)
> +REGS('v', ALL_VECTOR_REGS)
>   
>   /*
>    * Define constraint letters for constants:
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index d334857226..966d1ad981 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -32,38 +32,14 @@
>   
>   #ifdef CONFIG_DEBUG_TCG
>   static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
> -    "zero",
> -    "ra",
> -    "sp",
> -    "gp",
> -    "tp",
> -    "t0",
> -    "t1",
> -    "t2",
> -    "s0",
> -    "s1",
> -    "a0",
> -    "a1",
> -    "a2",
> -    "a3",
> -    "a4",
> -    "a5",
> -    "a6",
> -    "a7",
> -    "s2",
> -    "s3",
> -    "s4",
> -    "s5",
> -    "s6",
> -    "s7",
> -    "s8",
> -    "s9",
> -    "s10",
> -    "s11",
> -    "t3",
> -    "t4",
> -    "t5",
> -    "t6"
> +    "zero", "ra",  "sp",  "gp",  "tp",  "t0",  "t1",  "t2",
> +    "s0",   "s1",  "a0",  "a1",  "a2",  "a3",  "a4",  "a5",
> +    "a6",   "a7",  "s2",  "s3",  "s4",  "s5",  "s6",  "s7",
> +    "s8",   "s9",  "s10", "s11", "t3",  "t4",  "t5",  "t6",
> +    "v0",   "v1",  "v2",  "v3",  "v4",  "v5",  "v6",  "v7",
> +    "v8",   "v9",  "v10", "v11", "v12", "v13", "v14", "v15",
> +    "v16",  "v17", "v18", "v19", "v20", "v21", "v22", "v23",
> +    "v24",  "v25", "v26", "v27", "v28", "v29", "v30", "v31",
>   };
>   #endif
>   
> @@ -100,6 +76,16 @@ static const int tcg_target_reg_alloc_order[] = {
>       TCG_REG_A5,
>       TCG_REG_A6,
>       TCG_REG_A7,
> +
> +    /* Vector registers and TCG_REG_V0 reserved for mask. */
> +    TCG_REG_V1,  TCG_REG_V2,  TCG_REG_V3,  TCG_REG_V4,
> +    TCG_REG_V5,  TCG_REG_V6,  TCG_REG_V7,  TCG_REG_V8,
> +    TCG_REG_V9,  TCG_REG_V10, TCG_REG_V11, TCG_REG_V12,
> +    TCG_REG_V13, TCG_REG_V14, TCG_REG_V15, TCG_REG_V16,
> +    TCG_REG_V17, TCG_REG_V18, TCG_REG_V19, TCG_REG_V20,
> +    TCG_REG_V21, TCG_REG_V22, TCG_REG_V23, TCG_REG_V24,
> +    TCG_REG_V25, TCG_REG_V26, TCG_REG_V27, TCG_REG_V28,
> +    TCG_REG_V29, TCG_REG_V30, TCG_REG_V31,
>   };
>   
>   static const int tcg_target_call_iarg_regs[] = {
> @@ -127,6 +113,9 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
>   #define TCG_CT_CONST_J12  0x1000
>   
>   #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
> +#define ALL_VECTOR_REGS    MAKE_64BIT_MASK(32, 32)
> +#define ALL_DVECTOR_REG_GROUPS 0x5555555500000000
> +#define ALL_QVECTOR_REG_GROUPS 0x1111111100000000
>   
>   #define sextreg  sextract64
>   
> @@ -766,6 +755,23 @@ static void tcg_out_addsub2(TCGContext *s,
>       }
>   }
>   
> +static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
> +                                   TCGReg dst, TCGReg src)
> +{
> +    return false;
> +}
> +
> +static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
> +                                    TCGReg dst, TCGReg base, intptr_t offset)
> +{
> +    return false;
> +}
> +
> +static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
> +                                    TCGReg dst, int64_t arg)
> +{
> +}
> +
>   static const struct {
>       RISCVInsn op;
>       bool swap;
> @@ -1881,6 +1887,36 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>       }
>   }
>   
> +static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
> +                           unsigned vecl, unsigned vece,
> +                           const TCGArg args[TCG_MAX_OP_ARGS],
> +                           const int const_args[TCG_MAX_OP_ARGS])
> +{
> +    switch (opc) {
> +    case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov.  */
> +    case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec.  */
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece,
> +                       TCGArg a0, ...)
> +{
> +    switch (opc) {
> +    default:
> +        g_assert_not_reached();
> +    }
> +}
> +
> +int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
> +{
> +    switch (opc) {
> +    default:
> +        return 0;
> +    }
> +}
> +
>   static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
>   {
>       switch (op) {
> @@ -2100,6 +2136,30 @@ static void tcg_target_init(TCGContext *s)
>   {
>       tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff;
>       tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
> +    s->reserved_regs = 0;
> +
> +    switch (riscv_lg2_vlenb) {
> +    case TCG_TYPE_V64:
> +        tcg_target_available_regs[TCG_TYPE_V64] = ALL_VECTOR_REGS;
> +        tcg_target_available_regs[TCG_TYPE_V128] = ALL_DVECTOR_REG_GROUPS;
> +        tcg_target_available_regs[TCG_TYPE_V256] = ALL_QVECTOR_REG_GROUPS;
> +        s->reserved_regs |= (~ALL_QVECTOR_REG_GROUPS & ALL_VECTOR_REGS);
> +        break;
> +    case TCG_TYPE_V128:
> +        tcg_target_available_regs[TCG_TYPE_V64] = ALL_VECTOR_REGS;
> +        tcg_target_available_regs[TCG_TYPE_V128] = ALL_VECTOR_REGS;
> +        tcg_target_available_regs[TCG_TYPE_V256] = ALL_DVECTOR_REG_GROUPS;
> +        s->reserved_regs |= (~ALL_DVECTOR_REG_GROUPS & ALL_VECTOR_REGS);
> +        break;
> +    default:
> +        /* Guaranteed by Zve64x. */
> +        tcg_debug_assert(riscv_lg2_vlenb >= TCG_TYPE_V256);
> +
> +        tcg_target_available_regs[TCG_TYPE_V64] = ALL_VECTOR_REGS;
> +        tcg_target_available_regs[TCG_TYPE_V128] = ALL_VECTOR_REGS;
> +        tcg_target_available_regs[TCG_TYPE_V256] = ALL_VECTOR_REGS;
> +        break;
> +    }
>   
>       tcg_target_call_clobber_regs = -1u;
>       tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S0);
> @@ -2115,7 +2175,6 @@ static void tcg_target_init(TCGContext *s)
>       tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S10);
>       tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S11);
>   
> -    s->reserved_regs = 0;
>       tcg_regset_set_reg(s->reserved_regs, TCG_REG_ZERO);
>       tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
>       tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
> @@ -2123,6 +2182,7 @@ static void tcg_target_init(TCGContext *s)
>       tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP);
>       tcg_regset_set_reg(s->reserved_regs, TCG_REG_GP);
>       tcg_regset_set_reg(s->reserved_regs, TCG_REG_TP);
> +    tcg_regset_set_reg(s->reserved_regs, TCG_REG_V0);
>   }
>   
>   typedef struct {
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index 1a347eaf6e..12a7a37aaa 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -28,42 +28,28 @@
>   #include "host/cpuinfo.h"
>   
>   #define TCG_TARGET_INSN_UNIT_SIZE 4
> -#define TCG_TARGET_NB_REGS 32
> +#define TCG_TARGET_NB_REGS 64
>   #define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
>   
>   typedef enum {
> -    TCG_REG_ZERO,
> -    TCG_REG_RA,
> -    TCG_REG_SP,
> -    TCG_REG_GP,
> -    TCG_REG_TP,
> -    TCG_REG_T0,
> -    TCG_REG_T1,
> -    TCG_REG_T2,
> -    TCG_REG_S0,
> -    TCG_REG_S1,
> -    TCG_REG_A0,
> -    TCG_REG_A1,
> -    TCG_REG_A2,
> -    TCG_REG_A3,
> -    TCG_REG_A4,
> -    TCG_REG_A5,
> -    TCG_REG_A6,
> -    TCG_REG_A7,
> -    TCG_REG_S2,
> -    TCG_REG_S3,
> -    TCG_REG_S4,
> -    TCG_REG_S5,
> -    TCG_REG_S6,
> -    TCG_REG_S7,
> -    TCG_REG_S8,
> -    TCG_REG_S9,
> -    TCG_REG_S10,
> -    TCG_REG_S11,
> -    TCG_REG_T3,
> -    TCG_REG_T4,
> -    TCG_REG_T5,
> -    TCG_REG_T6,
> +    TCG_REG_ZERO, TCG_REG_RA,  TCG_REG_SP,  TCG_REG_GP,
> +    TCG_REG_TP,   TCG_REG_T0,  TCG_REG_T1,  TCG_REG_T2,
> +    TCG_REG_S0,   TCG_REG_S1,  TCG_REG_A0,  TCG_REG_A1,
> +    TCG_REG_A2,   TCG_REG_A3,  TCG_REG_A4,  TCG_REG_A5,
> +    TCG_REG_A6,   TCG_REG_A7,  TCG_REG_S2,  TCG_REG_S3,
> +    TCG_REG_S4,   TCG_REG_S5,  TCG_REG_S6,  TCG_REG_S7,
> +    TCG_REG_S8,   TCG_REG_S9,  TCG_REG_S10, TCG_REG_S11,
> +    TCG_REG_T3,   TCG_REG_T4,  TCG_REG_T5,  TCG_REG_T6,
> +
> +    /* RISC-V V Extension registers */
> +    TCG_REG_V0,   TCG_REG_V1,  TCG_REG_V2,  TCG_REG_V3,
> +    TCG_REG_V4,   TCG_REG_V5,  TCG_REG_V6,  TCG_REG_V7,
> +    TCG_REG_V8,   TCG_REG_V9,  TCG_REG_V10, TCG_REG_V11,
> +    TCG_REG_V12,  TCG_REG_V13, TCG_REG_V14, TCG_REG_V15,
> +    TCG_REG_V16,  TCG_REG_V17, TCG_REG_V18, TCG_REG_V19,
> +    TCG_REG_V20,  TCG_REG_V21, TCG_REG_V22, TCG_REG_V23,
> +    TCG_REG_V24,  TCG_REG_V25, TCG_REG_V26, TCG_REG_V27,
> +    TCG_REG_V28,  TCG_REG_V29, TCG_REG_V30, TCG_REG_V31,
>   
>       /* aliases */
>       TCG_AREG0          = TCG_REG_S0,
> @@ -156,6 +142,32 @@ typedef enum {
>   
>   #define TCG_TARGET_HAS_tst              0
>   
> +/* vector instructions */
> +#define TCG_TARGET_HAS_v64              0
> +#define TCG_TARGET_HAS_v128             0
> +#define TCG_TARGET_HAS_v256             0
> +#define TCG_TARGET_HAS_andc_vec         0
> +#define TCG_TARGET_HAS_orc_vec          0
> +#define TCG_TARGET_HAS_nand_vec         0
> +#define TCG_TARGET_HAS_nor_vec          0
> +#define TCG_TARGET_HAS_eqv_vec          0
> +#define TCG_TARGET_HAS_not_vec          0
> +#define TCG_TARGET_HAS_neg_vec          0
> +#define TCG_TARGET_HAS_abs_vec          0
> +#define TCG_TARGET_HAS_roti_vec         0
> +#define TCG_TARGET_HAS_rots_vec         0
> +#define TCG_TARGET_HAS_rotv_vec         0
> +#define TCG_TARGET_HAS_shi_vec          0
> +#define TCG_TARGET_HAS_shs_vec          0
> +#define TCG_TARGET_HAS_shv_vec          0
> +#define TCG_TARGET_HAS_mul_vec          0
> +#define TCG_TARGET_HAS_sat_vec          0
> +#define TCG_TARGET_HAS_minmax_vec       0
> +#define TCG_TARGET_HAS_bitsel_vec       0
> +#define TCG_TARGET_HAS_cmpsel_vec       0
> +
> +#define TCG_TARGET_HAS_tst_vec          0
> +
>   #define TCG_TARGET_DEFAULT_MO (0)
>   
>   #define TCG_TARGET_NEED_LDST_LABELS
> diff --git a/tcg/riscv/tcg-target.opc.h b/tcg/riscv/tcg-target.opc.h
> new file mode 100644
> index 0000000000..b80b39e1e5
> --- /dev/null
> +++ b/tcg/riscv/tcg-target.opc.h
> @@ -0,0 +1,12 @@
> +/*
> + * Copyright (c) C-SKY Microsystems Co., Ltd.
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or
> + * (at your option) any later version.
> + *
> + * See the COPYING file in the top-level directory for details.
> + *
> + * Target-specific opcodes for host vector expansion.  These will be
> + * emitted by tcg_expand_vec_op.  For those familiar with GCC internals,
> + * consider these to be UNSPEC with names.
> + */
Markus Armbruster Sept. 20, 2024, 11:37 a.m. UTC | #9
Daniel Henrique Barboza <dbarboza@ventanamicro.com> writes:

> Hi Zhiwei,
>
> As Richard already pointed out, we must have a "Signed-off-by" tag with the "author" of
> the patch, and it must be the exact spelling. So in this case:
>
> Signed-off-by: Swung0x48 <swung0x48@outlook.com>

I'm afraid we need a legal name here, not a nickname.

> More info here:
>
> https://www.qemu.org/docs/master/devel/submitting-a-patch.html

[...]
LIU Zhiwei Sept. 20, 2024, 2:26 p.m. UTC | #10
On 2024/9/20 12:01, 0x48 Swung wrote:
> Hey everyone! Late to the party. Life happens sometimes ;)
> Just discovered this patch and this mail list, and I'd like to provide 
> some background story here.
> I originally provided my initial implementation in a downstream repo 
> last year, namely 
> https://github.com/plctlab/plct-qemu/tree/plct-riscv-backend-rvv .
> I'm new to contributing to qemu and also take part in the open-source 
> community upstreaming process as a whole, so I may make mistakes in my 
> following claims, but I see some confusion here:
> 1. The PLCT branch (which includes my original commits) is 
> open-sourced using GPLv2, which follows QEMU's upstream repo. So 
> according to the license, my modification should be EXPLICITLY shown 
> in the patch, but I haven't seen any.
> 2. I do consent upstreaming my patch last year, in the form of a patch 
> submitted with modifications from T-head, and on behalf of them. And 
> it was agreed back in the days that I can be mentioned as one of the 
> authors. But it turns out that there's no "sign-off", "author", 
> "co-author" line mentioning me. If I don't speak out in this 
> situation, does it imply that this patch is purely LIU Zhiwei's work 
> and have nothing to do with me?
>
> I'd like LIU to separate my patch and his modification to two separate 
> patches, and explicitly name where are those patches coming from, so 
> that this patch can comply to GPLv2 license and can we clarify those 
> misunderstandings.
>
> I don't want to take it personally, but I do smell something's wrong 
> going on here...

I think there was a misunderstanding. But I will not explain it too much 
here. If you agree, please don't block this work and send the tag as 
Daniel and Markus point out.

Thanks,
Zhiwei

>
> Best Regards,
> Swung0x48 (aka. Huang Shiyuan)
>
> ------------------------------------------------------------------------
> *From:* Richard Henderson <richard.henderson@linaro.org>
> *Sent:* Wednesday, September 18, 2024 10:27:16 PM
> *To:* LIU Zhiwei <zhiwei_liu@linux.alibaba.com>; qemu-devel@nongnu.org 
> <qemu-devel@nongnu.org>
> *Cc:* qemu-riscv@nongnu.org <qemu-riscv@nongnu.org>; 
> palmer@dabbelt.com <palmer@dabbelt.com>; alistair.francis@wdc.com 
> <alistair.francis@wdc.com>; dbarboza@ventanamicro.com 
> <dbarboza@ventanamicro.com>; liwei1518@gmail.com 
> <liwei1518@gmail.com>; bmeng.cn@gmail.com <bmeng.cn@gmail.com>; 
> Swung0x48 <swung0x48@outlook.com>; TANG Tiancheng 
> <tangtiancheng.ttc@alibaba-inc.com>
> *Subject:* Re: [PATCH v4 02/12] tcg/riscv: Add basic support for vector
> On 9/18/24 12:43, LIU Zhiwei wrote:
> >
> > On 2024/9/18 18:11, Richard Henderson wrote:
> >> On 9/18/24 07:17, LIU Zhiwei wrote:
> >>>
> >>> On 2024/9/12 2:41, Richard Henderson wrote:
> >>>> On 9/11/24 06:26, LIU Zhiwei wrote:
> >>>>> From: Swung0x48<swung0x48@outlook.com>
> >>>>>
> >>>>> The RISC-V vector instruction set utilizes the LMUL field to group
> >>>>> multiple registers, enabling variable-length vector registers. This
> >>>>> implementation uses only the first register number of each group 
> while
> >>>>> reserving the other register numbers within the group.
> >>>>>
> >>>>> In TCG, each VEC_IR can have 3 types (TCG_TYPE_V64/128/256), and the
> >>>>> host runtime needs to adjust LMUL based on the type to use different
> >>>>> register groups.
> >>>>>
> >>>>> This presents challenges for TCG's register allocation. 
> Currently, we
> >>>>> avoid modifying the register allocation part of TCG and only 
> expose the
> >>>>> minimum number of vector registers.
> >>>>>
> >>>>> For example, when the host vlen is 64 bits and type is 
> TCG_TYPE_V256, with
> >>>>> LMUL equal to 4, we use 4 vector registers as one register 
> group. We can
> >>>>> use a maximum of 8 register groups, but the V0 register number 
> is reserved
> >>>>> as a mask register, so we can effectively use at most 7 register 
> groups.
> >>>>> Moreover, when type is smaller than TCG_TYPE_V256, only 7 
> registers are
> >>>>> forced to be used. This is because TCG cannot yet dynamically 
> constrain
> >>>>> registers with type; likewise, when the host vlen is 128 bits and
> >>>>> TCG_TYPE_V256, we can use at most 15 registers.
> >>>>>
> >>>>> There is not much pressure on vector register allocation in TCG 
> now, so
> >>>>> using 7 registers is feasible and will not have a major impact 
> on code
> >>>>> generation.
> >>>>>
> >>>>> This patch:
> >>>>> 1. Reserves vector register 0 for use as a mask register.
> >>>>> 2. When using register groups, reserves the additional registers 
> within
> >>>>>     each group.
> >>>>>
> >>>>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> >>>>> Co-authored-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> >>>>
> >>>> If there is a co-author, there should be another Signed-off-by.
> >>>
> >>> This patch has added a tag:
> >>>
> >>> Signed-off-by: TANG Tiancheng<tangtiancheng.ttc@alibaba-inc.com>
> >>>
> >>>
> >>> Do you mean we should add the same tag twice?
> >>
> >> The from line is "Swung0x48 <swung0x48@outlook.com>".
> >> If this is an alternate email for TANG Tiancheng,
> >
> > No, Swung0x48 is another author.
>
> Then we need a proper Signed-off-by line from that author.
>
>
> r~
0x48 Swung Sept. 21, 2024, 3:56 p.m. UTC | #11
Signed-off-by: Huang Shiyuan <swung0x48@outlook.com>

This is the tag. Is this fine or do I need to do something else? Thanks for the help from everybody in this list!

On Sept. 20, 2024, at 22:28, LIU Zhiwei <zhiwei_liu@linux.alibaba.com> wrote:

> On 2024/9/20 12:01, 0x48 Swung wrote:
>> Hey everyone! Late to the party. Life happens sometimes ;)
>> Just discovered this patch and this mail list, and I'd like to provide some background story here.
>> I originally provided my initial implementation in a downstream repo last year, namely https://github.com/plctlab/plct-qemu/tree/plct-riscv-backend-rvv .
>> I'm new to contributing to qemu and also take part in the open-source community upstreaming process as a whole, so I may make mistakes in my following claims, but I see some confusion here:
>> 1. The PLCT branch (which includes my original commits) is open-sourced using GPLv2, which follows QEMU's upstream repo. So according to the license, my modification should be EXPLICITLY shown in the patch, but I haven't seen any.
>> 2. I do consent upstreaming my patch last year, in the form of a patch submitted with modifications from T-head, and on behalf of them. And it was agreed back in the days that I can be mentioned as one of the authors. But it turns out that there's no "sign-off", "author", "co-author" line mentioning me. If I don't speak out in this situation, does it imply that this patch is purely LIU Zhiwei's work and have nothing to do with me?
>>
>> I'd like LIU to separate my patch and his modification to two separate patches, and explicitly name where are those patches coming from, so that this patch can comply to GPLv2 license and can we clarify those misunderstandings.
>>
>> I don't want to take it personally, but I do smell something's wrong going on here...
>
> I think there was a misunderstanding. But I will not explain it too much here. If you agree, please don't block this work and send the tag as Daniel and Markus point out.
>
> Thanks,
> Zhiwei
>
>> Best Regards,
>> Swung0x48 (aka. Huang Shiyuan)
Daniel Henrique Barboza Sept. 21, 2024, 5:17 p.m. UTC | #12
On 9/21/24 12:56 PM, 0x48 Swung wrote:
> Signed-off-by: Huang Shiyuan <swung0x48@outlook.com>
> 
> This is the tag. Is this fine or do I need to do something else? Thanks for the help from everybody in this list!

Thanks! This is enough. Zhiwei can add the tag in the patch in v5.


Daniel

Patch

diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
index d5c419dff1..b2b3211bcb 100644
--- a/tcg/riscv/tcg-target-con-str.h
+++ b/tcg/riscv/tcg-target-con-str.h
@@ -9,6 +9,7 @@ 
  * REGS(letter, register_mask)
  */
 REGS('r', ALL_GENERAL_REGS)
+REGS('v', ALL_VECTOR_REGS)
 
 /*
  * Define constraint letters for constants:
diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index d334857226..966d1ad981 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -32,38 +32,14 @@ 
 
 #ifdef CONFIG_DEBUG_TCG
 static const char * const tcg_target_reg_names[TCG_TARGET_NB_REGS] = {
-    "zero",
-    "ra",
-    "sp",
-    "gp",
-    "tp",
-    "t0",
-    "t1",
-    "t2",
-    "s0",
-    "s1",
-    "a0",
-    "a1",
-    "a2",
-    "a3",
-    "a4",
-    "a5",
-    "a6",
-    "a7",
-    "s2",
-    "s3",
-    "s4",
-    "s5",
-    "s6",
-    "s7",
-    "s8",
-    "s9",
-    "s10",
-    "s11",
-    "t3",
-    "t4",
-    "t5",
-    "t6"
+    "zero", "ra",  "sp",  "gp",  "tp",  "t0",  "t1",  "t2",
+    "s0",   "s1",  "a0",  "a1",  "a2",  "a3",  "a4",  "a5",
+    "a6",   "a7",  "s2",  "s3",  "s4",  "s5",  "s6",  "s7",
+    "s8",   "s9",  "s10", "s11", "t3",  "t4",  "t5",  "t6",
+    "v0",   "v1",  "v2",  "v3",  "v4",  "v5",  "v6",  "v7",
+    "v8",   "v9",  "v10", "v11", "v12", "v13", "v14", "v15",
+    "v16",  "v17", "v18", "v19", "v20", "v21", "v22", "v23",
+    "v24",  "v25", "v26", "v27", "v28", "v29", "v30", "v31",
 };
 #endif
 
@@ -100,6 +76,16 @@  static const int tcg_target_reg_alloc_order[] = {
     TCG_REG_A5,
     TCG_REG_A6,
     TCG_REG_A7,
+
+    /* Vector registers and TCG_REG_V0 reserved for mask. */
+    TCG_REG_V1,  TCG_REG_V2,  TCG_REG_V3,  TCG_REG_V4,
+    TCG_REG_V5,  TCG_REG_V6,  TCG_REG_V7,  TCG_REG_V8,
+    TCG_REG_V9,  TCG_REG_V10, TCG_REG_V11, TCG_REG_V12,
+    TCG_REG_V13, TCG_REG_V14, TCG_REG_V15, TCG_REG_V16,
+    TCG_REG_V17, TCG_REG_V18, TCG_REG_V19, TCG_REG_V20,
+    TCG_REG_V21, TCG_REG_V22, TCG_REG_V23, TCG_REG_V24,
+    TCG_REG_V25, TCG_REG_V26, TCG_REG_V27, TCG_REG_V28,
+    TCG_REG_V29, TCG_REG_V30, TCG_REG_V31,
 };
 
 static const int tcg_target_call_iarg_regs[] = {
@@ -127,6 +113,9 @@  static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
 #define TCG_CT_CONST_J12  0x1000
 
 #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
+#define ALL_VECTOR_REGS    MAKE_64BIT_MASK(32, 32)
+#define ALL_DVECTOR_REG_GROUPS 0x5555555500000000
+#define ALL_QVECTOR_REG_GROUPS 0x1111111100000000
 
 #define sextreg  sextract64
 
@@ -766,6 +755,23 @@  static void tcg_out_addsub2(TCGContext *s,
     }
 }
 
+static bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece,
+                                   TCGReg dst, TCGReg src)
+{
+    return false;
+}
+
+static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece,
+                                    TCGReg dst, TCGReg base, intptr_t offset)
+{
+    return false;
+}
+
+static void tcg_out_dupi_vec(TCGContext *s, TCGType type, unsigned vece,
+                                    TCGReg dst, int64_t arg)
+{
+}
+
 static const struct {
     RISCVInsn op;
     bool swap;
@@ -1881,6 +1887,36 @@  static void tcg_out_op(TCGContext *s, TCGOpcode opc,
     }
 }
 
+static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc,
+                           unsigned vecl, unsigned vece,
+                           const TCGArg args[TCG_MAX_OP_ARGS],
+                           const int const_args[TCG_MAX_OP_ARGS])
+{
+    switch (opc) {
+    case INDEX_op_mov_vec: /* Always emitted via tcg_out_mov.  */
+    case INDEX_op_dup_vec: /* Always emitted via tcg_out_dup_vec.  */
+    default:
+        g_assert_not_reached();
+    }
+}
+
+void tcg_expand_vec_op(TCGOpcode opc, TCGType type, unsigned vece,
+                       TCGArg a0, ...)
+{
+    switch (opc) {
+    default:
+        g_assert_not_reached();
+    }
+}
+
+int tcg_can_emit_vec_op(TCGOpcode opc, TCGType type, unsigned vece)
+{
+    switch (opc) {
+    default:
+        return 0;
+    }
+}
+
 static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode op)
 {
     switch (op) {
@@ -2100,6 +2136,30 @@  static void tcg_target_init(TCGContext *s)
 {
     tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff;
     tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
+    s->reserved_regs = 0;
+
+    switch (riscv_lg2_vlenb) {
+    case TCG_TYPE_V64:
+        tcg_target_available_regs[TCG_TYPE_V64] = ALL_VECTOR_REGS;
+        tcg_target_available_regs[TCG_TYPE_V128] = ALL_DVECTOR_REG_GROUPS;
+        tcg_target_available_regs[TCG_TYPE_V256] = ALL_QVECTOR_REG_GROUPS;
+        s->reserved_regs |= (~ALL_QVECTOR_REG_GROUPS & ALL_VECTOR_REGS);
+        break;
+    case TCG_TYPE_V128:
+        tcg_target_available_regs[TCG_TYPE_V64] = ALL_VECTOR_REGS;
+        tcg_target_available_regs[TCG_TYPE_V128] = ALL_VECTOR_REGS;
+        tcg_target_available_regs[TCG_TYPE_V256] = ALL_DVECTOR_REG_GROUPS;
+        s->reserved_regs |= (~ALL_DVECTOR_REG_GROUPS & ALL_VECTOR_REGS);
+        break;
+    default:
+        /* Guaranteed by Zve64x. */
+        tcg_debug_assert(riscv_lg2_vlenb >= TCG_TYPE_V256);
+
+        tcg_target_available_regs[TCG_TYPE_V64] = ALL_VECTOR_REGS;
+        tcg_target_available_regs[TCG_TYPE_V128] = ALL_VECTOR_REGS;
+        tcg_target_available_regs[TCG_TYPE_V256] = ALL_VECTOR_REGS;
+        break;
+    }
 
     tcg_target_call_clobber_regs = -1u;
     tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S0);
@@ -2115,7 +2175,6 @@  static void tcg_target_init(TCGContext *s)
     tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S10);
     tcg_regset_reset_reg(tcg_target_call_clobber_regs, TCG_REG_S11);
 
-    s->reserved_regs = 0;
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_ZERO);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP0);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_TMP1);
@@ -2123,6 +2182,7 @@  static void tcg_target_init(TCGContext *s)
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_SP);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_GP);
     tcg_regset_set_reg(s->reserved_regs, TCG_REG_TP);
+    tcg_regset_set_reg(s->reserved_regs, TCG_REG_V0);
 }
 
 typedef struct {
diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
index 1a347eaf6e..12a7a37aaa 100644
--- a/tcg/riscv/tcg-target.h
+++ b/tcg/riscv/tcg-target.h
@@ -28,42 +28,28 @@ 
 #include "host/cpuinfo.h"
 
 #define TCG_TARGET_INSN_UNIT_SIZE 4
-#define TCG_TARGET_NB_REGS 32
+#define TCG_TARGET_NB_REGS 64
 #define MAX_CODE_GEN_BUFFER_SIZE  ((size_t)-1)
 
 typedef enum {
-    TCG_REG_ZERO,
-    TCG_REG_RA,
-    TCG_REG_SP,
-    TCG_REG_GP,
-    TCG_REG_TP,
-    TCG_REG_T0,
-    TCG_REG_T1,
-    TCG_REG_T2,
-    TCG_REG_S0,
-    TCG_REG_S1,
-    TCG_REG_A0,
-    TCG_REG_A1,
-    TCG_REG_A2,
-    TCG_REG_A3,
-    TCG_REG_A4,
-    TCG_REG_A5,
-    TCG_REG_A6,
-    TCG_REG_A7,
-    TCG_REG_S2,
-    TCG_REG_S3,
-    TCG_REG_S4,
-    TCG_REG_S5,
-    TCG_REG_S6,
-    TCG_REG_S7,
-    TCG_REG_S8,
-    TCG_REG_S9,
-    TCG_REG_S10,
-    TCG_REG_S11,
-    TCG_REG_T3,
-    TCG_REG_T4,
-    TCG_REG_T5,
-    TCG_REG_T6,
+    TCG_REG_ZERO, TCG_REG_RA,  TCG_REG_SP,  TCG_REG_GP,
+    TCG_REG_TP,   TCG_REG_T0,  TCG_REG_T1,  TCG_REG_T2,
+    TCG_REG_S0,   TCG_REG_S1,  TCG_REG_A0,  TCG_REG_A1,
+    TCG_REG_A2,   TCG_REG_A3,  TCG_REG_A4,  TCG_REG_A5,
+    TCG_REG_A6,   TCG_REG_A7,  TCG_REG_S2,  TCG_REG_S3,
+    TCG_REG_S4,   TCG_REG_S5,  TCG_REG_S6,  TCG_REG_S7,
+    TCG_REG_S8,   TCG_REG_S9,  TCG_REG_S10, TCG_REG_S11,
+    TCG_REG_T3,   TCG_REG_T4,  TCG_REG_T5,  TCG_REG_T6,
+
+    /* RISC-V V Extension registers */
+    TCG_REG_V0,   TCG_REG_V1,  TCG_REG_V2,  TCG_REG_V3,
+    TCG_REG_V4,   TCG_REG_V5,  TCG_REG_V6,  TCG_REG_V7,
+    TCG_REG_V8,   TCG_REG_V9,  TCG_REG_V10, TCG_REG_V11,
+    TCG_REG_V12,  TCG_REG_V13, TCG_REG_V14, TCG_REG_V15,
+    TCG_REG_V16,  TCG_REG_V17, TCG_REG_V18, TCG_REG_V19,
+    TCG_REG_V20,  TCG_REG_V21, TCG_REG_V22, TCG_REG_V23,
+    TCG_REG_V24,  TCG_REG_V25, TCG_REG_V26, TCG_REG_V27,
+    TCG_REG_V28,  TCG_REG_V29, TCG_REG_V30, TCG_REG_V31,
 
     /* aliases */
     TCG_AREG0          = TCG_REG_S0,
@@ -156,6 +142,32 @@  typedef enum {
 
 #define TCG_TARGET_HAS_tst              0
 
+/* vector instructions */
+#define TCG_TARGET_HAS_v64              0
+#define TCG_TARGET_HAS_v128             0
+#define TCG_TARGET_HAS_v256             0
+#define TCG_TARGET_HAS_andc_vec         0
+#define TCG_TARGET_HAS_orc_vec          0
+#define TCG_TARGET_HAS_nand_vec         0
+#define TCG_TARGET_HAS_nor_vec          0
+#define TCG_TARGET_HAS_eqv_vec          0
+#define TCG_TARGET_HAS_not_vec          0
+#define TCG_TARGET_HAS_neg_vec          0
+#define TCG_TARGET_HAS_abs_vec          0
+#define TCG_TARGET_HAS_roti_vec         0
+#define TCG_TARGET_HAS_rots_vec         0
+#define TCG_TARGET_HAS_rotv_vec         0
+#define TCG_TARGET_HAS_shi_vec          0
+#define TCG_TARGET_HAS_shs_vec          0
+#define TCG_TARGET_HAS_shv_vec          0
+#define TCG_TARGET_HAS_mul_vec          0
+#define TCG_TARGET_HAS_sat_vec          0
+#define TCG_TARGET_HAS_minmax_vec       0
+#define TCG_TARGET_HAS_bitsel_vec       0
+#define TCG_TARGET_HAS_cmpsel_vec       0
+
+#define TCG_TARGET_HAS_tst_vec          0
+
 #define TCG_TARGET_DEFAULT_MO (0)
 
 #define TCG_TARGET_NEED_LDST_LABELS
diff --git a/tcg/riscv/tcg-target.opc.h b/tcg/riscv/tcg-target.opc.h
new file mode 100644
index 0000000000..b80b39e1e5
--- /dev/null
+++ b/tcg/riscv/tcg-target.opc.h
@@ -0,0 +1,12 @@ 
+/*
+ * Copyright (c) C-SKY Microsystems Co., Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * (at your option) any later version.
+ *
+ * See the COPYING file in the top-level directory for details.
+ *
+ * Target-specific opcodes for host vector expansion.  These will be
+ * emitted by tcg_expand_vec_op.  For those familiar with GCC internals,
+ * consider these to be UNSPEC with names.
+ */