[v2,0/2] tcg: Streamline vector load/store

Message ID 20231013175109.124308-1-richard.henderson@linaro.org

Richard Henderson Oct. 13, 2023, 5:51 p.m. UTC
We have tcg_gen_qemu_{ld,st}_i128, which can be used to implement
load/store of vectors to guest memory.  But at present we have to
split into, or concatenate from, two i64 values to reference the guest
vector register backing store within env.

Provide tcg_gen_{ld,st}_i128, which can avoid the trip through i64.

This does require that the target store i128 in host byte ordering,
which is true of i386 (and some other backends) but not arm or s390x.
There is definitely further cleanup possible.
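
To illustrate the host byte ordering requirement, here is a minimal
standalone sketch in plain C. It is not QEMU code and does not use the TCG
API: the I128 type, the ModelEnv layout, and the model_* helpers are all
hypothetical names made up for this example. It only models the idea that a
128-bit value kept in env in host order places its low 64-bit half first on
a little-endian host and second on a big-endian host.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit value: two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} I128;

/* Hypothetical env with a 128-bit vector register backing store. */
typedef struct {
    uint64_t pad;
    uint8_t vreg0[16];
} ModelEnv;

/* Detect host byte order at run time by inspecting the first byte of 1. */
static int host_big_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;

    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Model of a host-order 128-bit load from base + offset. */
static I128 model_ld_i128(const void *base, size_t offset)
{
    const uint8_t *p = (const uint8_t *)base + offset;
    int be = host_big_endian();
    I128 v;

    /* The low half sits at the lower address only on little-endian hosts. */
    memcpy(&v.lo, p + (be ? 8 : 0), 8);
    memcpy(&v.hi, p + (be ? 0 : 8), 8);
    return v;
}

/* Model of a host-order 128-bit store to base + offset. */
static void model_st_i128(void *base, size_t offset, I128 v)
{
    uint8_t *p = (uint8_t *)base + offset;
    int be = host_big_endian();

    memcpy(p + (be ? 8 : 0), &v.lo, 8);
    memcpy(p + (be ? 0 : 8), &v.hi, 8);
}

/* Store then reload a value and check that nothing else was disturbed. */
static int model_roundtrip_ok(void)
{
    ModelEnv env;
    I128 in = { 0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL };
    I128 out;

    memset(&env, 0, sizeof(env));
    model_st_i128(&env, offsetof(ModelEnv, vreg0), in);
    out = model_ld_i128(&env, offsetof(ModelEnv, vreg0));
    return out.lo == in.lo && out.hi == in.hi && env.pad == 0;
}
```

A target whose register file uses a fixed layout that differs from the
host's would read back swapped halves under this model, which is why such
targets cannot use a host-order access directly.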

Changes for v2:
  * Set atomicity for x86 vector operations.


r~


Richard Henderson (2):
  tcg: Add tcg_gen_{ld,st}_i128
  target/i386: Use i128 for 128 and 256-bit loads and stores

 include/tcg/tcg-op-common.h |  3 ++
 target/i386/tcg/translate.c | 63 +++++++++++++++++--------------------
 tcg/tcg-op.c                | 22 +++++++++++++
 3 files changed, 54 insertions(+), 34 deletions(-)

Comments

Philippe Mathieu-Daudé Oct. 17, 2023, 11:52 a.m. UTC | #1
On 13/10/23 19:51, Richard Henderson wrote:
> We have tcg_gen_qemu_{ld,st}_i128, which can be used to implement
> load/store of vectors to guest memory.  But at present we have to
> split into, or concatenate from, two i64 values to reference the guest
> vector register backing store within env.
> 
> Provide tcg_gen_{ld,st}_i128, which can avoid the trip through i64.
> 
> This does require that the target store i128 in host byte ordering,
> which is true of i386 (and some other backends) but not arm or s390x.
> There is definitely further cleanup possible.

Is hexagon's gen_vreg_load() a candidate?

Richard Henderson Oct. 17, 2023, 1:38 p.m. UTC | #2
On 10/17/23 04:52, Philippe Mathieu-Daudé wrote:
> On 13/10/23 19:51, Richard Henderson wrote:
>> We have tcg_gen_qemu_{ld,st}_i128, which can be used to implement
>> load/store of vectors to guest memory.  But at present we have to
>> split into, or concatenate from, two i64 values to reference the guest
>> vector register backing store within env.
>>
>> Provide tcg_gen_{ld,st}_i128, which can avoid the trip through i64.
>>
>> This does require that the target store i128 in host byte ordering,
>> which is true of i386 (and some other backends) but not arm or s390x.
>> There is definitely further cleanup possible.
> 
> Is hexagon's gen_vreg_load() a candidate?

Yes.


r~