[0/8] i386: Optimize code with AVX10.2 new instructions

Message ID 20240826064238.2268967-1-haochen.jiang@intel.com

Message

Haochen Jiang Aug. 26, 2024, 6:42 a.m. UTC
Hi all,

I committed the patches for the new AVX10.2 instructions to trunk a few
hours ago. The next and final part of the AVX10.2 upstreaming is to
optimize code generation using those new instructions.

This patch series contains the following optimizations:

  - Auto-vectorization with VNNI instructions (PATCH 1).
  - Codegen optimization using the new scalar comparison instructions
    to eliminate redundant code (PATCH 2-3).
  - Auto-vectorization with BF16 instructions (PATCH 4-8).
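
As a rough illustration only (not taken from the patches or their
testcases), the source patterns these three groups are aimed at look
something like the C below. The function names are made up for this
sketch, and whether GCC actually emits AVX10.2 instructions for them
depends on the compiler version, the optimization level, and the
AVX10.2 target option being enabled, all of which are assumptions here.

```c
#include <stddef.h>
#include <stdint.h>

/* PATCH 1 (VNNI): an 8-bit dot product accumulated into 32 bits.
   This reduction shape is what a dot-product vectorizer looks for.
   dot_s8 is a hypothetical name. */
int32_t dot_s8(const int8_t *a, const int8_t *b, size_t n)
{
    int32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int32_t)a[i] * (int32_t)b[i];
    return sum;
}

/* PATCH 2-3 (scalar compares): with the older (U)COMISD forms an
   equality test must also rule out the unordered (NaN) case through
   the parity flag, which costs extra instructions.  These two
   functions are the kind of source that exposes that overhead. */
int is_equal(double a, double b)
{
    return a == b;
}

int is_ordered_and_not_equal(double a, double b)
{
    /* False when either operand is NaN, unlike a != b. */
    return a < b || a > b;
}

/* PATCH 4-8 (BF16): an elementwise loop over __bf16 data.  The guard
   assumes the compiler defines __BFLT16_MAX__ when __bf16 arithmetic
   is available; on other compilers this part is simply skipped. */
#ifdef __BFLT16_MAX__
void add_bf16(__bf16 *restrict dst, const __bf16 *restrict a,
              const __bf16 *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}
#endif
```

Comparing the generated assembly with and without the AVX10.2 target
option is the simplest way to see which of these loops and compares
are affected on a given compiler build.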

This completes the upstreaming of the AVX10.2 series.

Afterwards, we may add V2BF/V4BF support in a separate thread, as we
did for V2HF/V4HF when AVX512FP16 was upstreamed.

Bootstrapped on x86_64-pc-linux-gnu. OK for trunk?

Thx,
Haochen

Comments

Hongtao Liu Sept. 2, 2024, 2:06 a.m. UTC | #1
On Mon, Aug 26, 2024 at 2:43 PM Haochen Jiang <haochen.jiang@intel.com> wrote:
> Bootstrapped on x86_64-pc-linux-gnu. OK for trunk?
Ok for all 8 patches.