diff mbox series

aarch64: Add ACLE intrinsics for AdvSIMD faminmax

Message ID 20240722113937.959073-1-saurabh.jha@arm.com
State New
Series aarch64: Add ACLE intrinsics for AdvSIMD faminmax

Commit Message

Saurabh Jha July 22, 2024, 11:39 a.m. UTC
The AArch64 FEAT_FAMINMAX extension is optional in Armv9.2 and mandatory
in Armv9.5. It introduces instructions that compute, element-wise, the
maximum and minimum of the absolute values of two floating-point vectors.

This patch introduces intrinsics for the AdvSIMD faminmax extension in
the form of the following built-in functions:
* vamax_f16
* vamaxq_f16
* vamax_f32
* vamaxq_f32
* vamaxq_f64
* vamin_f16
* vaminq_f16
* vamin_f32
* vaminq_f32
* vaminq_f64
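
As a rough illustration of what these intrinsics compute per lane, the following scalar C sketch models absolute maximum and minimum. The helper names `ref_famax`, `ref_famin` and `ref_famax_lanes` are hypothetical and not part of the patch, and the sketch deliberately leaves NaN and signed-zero handling unmodelled; the architecture defines those cases separately.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical scalar reference: maximum of the absolute values.
   NaN handling is NOT modelled here.  */
static float
ref_famax (float a, float b)
{
  float x = fabsf (a), y = fabsf (b);
  return x > y ? x : y;
}

/* Hypothetical scalar reference: minimum of the absolute values.
   NaN handling is NOT modelled here.  */
static float
ref_famin (float a, float b)
{
  float x = fabsf (a), y = fabsf (b);
  return x < y ? x : y;
}

/* Apply the absolute-maximum reference to N lanes, mirroring the
   element-wise behaviour of the vector intrinsics.  */
static void
ref_famax_lanes (float *out, const float *a, const float *b, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = ref_famax (a[i], b[i]);
}
```

For instance, `ref_famax (-3.0f, 2.0f)` gives 3.0f and `ref_famin (-3.0f, 2.0f)` gives 2.0f, since the sign is discarded before the comparison.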

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.cc
	(enum aarch64_builtins): New enum values for faminmax builtins.
	(aarch64_init_faminmax_builtins): New function to declare new
	builtins.
	(handle_arm_neon_h): Modified to call
	aarch64_init_faminmax_builtins.
	(aarch64_general_check_builtin_call): Modified to check whether
	the +faminmax flag is in use and to report an error if it is not.
	(aarch64_expand_builtin_faminmax): New function to emit
	instructions of this extension.
	(aarch64_general_expand_builtin): Modified to call
	aarch64_expand_builtin_faminmax.
	* config/aarch64/aarch64-option-extensions.def
	(AARCH64_OPT_EXTENSION): Introduce new flag for this extension.
	* config/aarch64/aarch64-simd.md
	(aarch64_<faminmax><mode>): Introduce instruction pattern for this
	extension.
	* config/aarch64/aarch64.h
	(TARGET_FAMINMAX): Introduce new flag for this extension.
	* config/aarch64/iterators.md: Introduce new iterators for this
	extension.
	* config/arm/types.md: Introduce neon_fp_aminmax<q> attributes.
	* doc/invoke.texi: Document extension in AArch64 Options.

gcc/testsuite/ChangeLog:

        * gcc.target/aarch64/simd/faminmax.c: New tests for this
	  extension.
---
Hi,

Regression tested for aarch64-none-linux-gnu and found no regressions.

This patch depends on the patch series "Extend
aarch64_feature_flags to 128 bits", which is under review. This patch
should be committed only after that patch series is committed. I am
raising this patch now for early feedback.

Ok for master? I don't have commit access so can someone please commit
on my behalf?

Regards,
Saurabh
---
 gcc/config/aarch64/aarch64-builtins.cc        | 150 ++++++++++++++++--
 .../aarch64/aarch64-option-extensions.def     |   2 +
 gcc/config/aarch64/aarch64-simd.md            |  11 ++
 gcc/config/aarch64/aarch64.h                  |   4 +
 gcc/config/aarch64/iterators.md               |  10 ++
 gcc/config/arm/types.md                       |   6 +
 gcc/doc/invoke.texi                           |   2 +
 .../gcc.target/aarch64/simd/faminmax.c        |  40 +++++
 8 files changed, 216 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax.c

Comments

Kyrylo Tkachov July 22, 2024, 11:57 a.m. UTC | #1
Hi Saurabh,

> On 22 Jul 2024, at 13:39, saurabh.jha@arm.com wrote:
> 
> The AArch64 FEAT_FAMINMAX extension is optional in Armv9.2 and mandatory
> in Armv9.5. It introduces instructions for computing the floating point
> absolute maximum and minimum of the two vectors element-wise.
> 
> This patch introduces intrinsics for AdvSIMD faminmax extension in the
> form of the following builtin-functions:
> * vamax_f16
> * vamaxq_f16
> * vamax_f32
> * vamaxq_f32
> * vamaxq_f64
> * vamin_f16
> * vaminq_f16
> * vamin_f32
> * vaminq_f32
> * vaminq_f64
> 
> gcc/ChangeLog:
> 
>        * config/aarch64/aarch64-builtins.cc
>        (enum aarch64_builtins): New enum values for faminmax builtins.
>        (aarch64_init_faminmax_builtins): New function to declare new
> builtins.
>        (handle_arm_neon_h): Modified to call
> aarch64_init_faminmax_builtins.
>        (aarch64_general_check_builtin_call): Modified to check whether
> +faminmax flag is being used and printing error message if not used.
>        (aarch64_expand_builtin_faminmax): New function to emit
> instructions of this extension.
>        (aarch64_general_expand_builtin): Modified to call aarch64_expand_builtin_faminmax.
>        * config/aarch64/aarch64-option-extensions.def
>        (AARCH64_OPT_EXTENSION): Introduce new flag for this extension.
>        * config/aarch64/aarch64-simd.md
>        (aarch64_<faminmax><mode>): Introduce instruction pattern for this extension.
>        * config/aarch64/aarch64.h
>        (TARGET_FAMINMAX): Introduce new flag for this extension.
>        * config/aarch64/iterators.md: Introduce new iterators for this extension.
>        * config/arm/types.md: Introduce neon_fp_aminmax<q> attributes.
>        * doc/invoke.texi: Document extension in AArch64 Options.
> 
> gcc/testsuite/ChangeLog:
> 
>        * gcc.target/aarch64/simd/faminmax.c: New tests for this
>          extension.
> ---
> Hi,
> 
> Regression tested for aarch64-none-linux-gnu and found no regressions.
> 
> This patch is dependent on the patch series "Extend
> aarch64_feature_flags to 128 bits" which is under review. This patch
> should be committed only after that patch series is committed. I am
> raising this patch now for early feedback.
> 
> Ok for master? I don't have commit access so can someone please commit
> on my behalf?
> 
> Regards,
> Saurabh
> ---
> gcc/config/aarch64/aarch64-builtins.cc        | 150 ++++++++++++++++--
> .../aarch64/aarch64-option-extensions.def     |   2 +
> gcc/config/aarch64/aarch64-simd.md            |  11 ++
> gcc/config/aarch64/aarch64.h                  |   4 +
> gcc/config/aarch64/iterators.md               |  10 ++
> gcc/config/arm/types.md                       |   6 +
> gcc/doc/invoke.texi                           |   2 +
> .../gcc.target/aarch64/simd/faminmax.c        |  40 +++++
> 8 files changed, 216 insertions(+), 9 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/faminmax.c
> 
> <0001-aarch64-Add-ACLE-intrinsics-for-AdvSIMD-faminmax.patch>

@@ -2197,15 +2269,34 @@ aarch64_general_check_builtin_call (location_t location, vec<location_t>,
case AARCH64_WSR64:
case AARCH64_WSRF:
case AARCH64_WSRF64:
- tree addr = STRIP_NOPS (args[0]);
- if (TREE_CODE (TREE_TYPE (addr)) != POINTER_TYPE
- || TREE_CODE (addr) != ADDR_EXPR
- || TREE_CODE (TREE_OPERAND (addr, 0)) != STRING_CST)
- {
- error_at (location, "first argument to %qD must be a string literal",
- fndecl);
- return false;
- }
+ {
+ tree addr = STRIP_NOPS (args[0]);
+ if (TREE_CODE (TREE_TYPE (addr)) != POINTER_TYPE
+ || TREE_CODE (addr) != ADDR_EXPR
+ || TREE_CODE (TREE_OPERAND (addr, 0)) != STRING_CST)
+ {
+ error_at (location,
+ "first argument to %qD must be a string literal",
+ fndecl);
+ return false;
+ }
+ }
+ case AARCH64_FAMINMAX_BUILTIN_FAMAX8H:
+ case AARCH64_FAMINMAX_BUILTIN_FAMAX2S:
+ case AARCH64_FAMINMAX_BUILTIN_FAMAX4S:
+ case AARCH64_FAMINMAX_BUILTIN_FAMAX2D:
+ case AARCH64_FAMINMAX_BUILTIN_FAMIN4H:
+ case AARCH64_FAMINMAX_BUILTIN_FAMIN8H:
+ case AARCH64_FAMINMAX_BUILTIN_FAMIN2S:
+ case AARCH64_FAMINMAX_BUILTIN_FAMIN4S:
+ case AARCH64_FAMINMAX_BUILTIN_FAMIN2D:
+ {
+ if (!TARGET_FAMINMAX)
+ {
+ error_at (location, "need +faminmax flag to call %qD", fndecl);

I’d rather keep these error messages consistent around the backend.
Can we reuse report_missing_extension from aarch64-sve-builtins.cc somehow to have the “intrinsic requires +foo extension” reporting in one place?
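
To make the "one place" idea concrete, here is a minimal standalone C sketch of a shared formatter that every builtin family could route through. It does not use GCC's diagnostic machinery, and neither the helper name `format_missing_extension` nor the message wording is taken from the GCC sources; both are assumptions made for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: build the diagnostic text in a single function
   so that every builtin family reports a missing extension with the
   same wording.  Returns the snprintf result.  */
static int
format_missing_extension (char *buf, size_t size,
			  const char *fn_name, const char *extension)
{
  return snprintf (buf, size,
		   "ACLE function '%s' requires ISA extension '%s'",
		   fn_name, extension);
}
```

A caller in the check routine would then pass only the function name and the extension string, so the wording lives in a single function and cannot drift between builtin families.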


diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bbeee221f37..c769a482312 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -9881,3 +9881,14 @@
"shl\\t%d0, %d1, #16"
[(set_attr "type" "neon_shift_imm")]
)
+
+;; faminmax instruction patterns
+(define_insn "aarch64_<faminmax><mode>"
+ [(set (match_operand:VHSDF 0 "register_operand" "=w")
+ (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")
+ (match_operand:VHSDF 2 "register_operand" "w")]
+ FAMINMAX_UNS))]
+ "TARGET_FAMINMAX"
+ "<faminmax>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+ [(set_attr "type" "neon_fp_aminmax<q>")]
+)

Any reason why we can’t represent these using organic RTL codes for smin/smax and abs?
It would allow the RTL optimizers to more easily see through the semantics.
Or is the NaN/Inf handling of these instructions inconsistent with the RTL semantics?

Thanks,
Kyrill

Saurabh Jha Aug. 1, 2024, 9:41 a.m. UTC | #2
Hi Kyrill,

Thank you for the review. I have addressed all the comments here: 
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/658968.html

Thanks,
Saurabh


Patch

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index 30669f8aa18..b3d8cf22eeb 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -829,6 +829,17 @@  enum aarch64_builtins
   AARCH64_RBIT,
   AARCH64_RBITL,
   AARCH64_RBITLL,
+  /* FAMINMAX builtins.  */
+  AARCH64_FAMINMAX_BUILTIN_FAMAX4H,
+  AARCH64_FAMINMAX_BUILTIN_FAMAX8H,
+  AARCH64_FAMINMAX_BUILTIN_FAMAX2S,
+  AARCH64_FAMINMAX_BUILTIN_FAMAX4S,
+  AARCH64_FAMINMAX_BUILTIN_FAMAX2D,
+  AARCH64_FAMINMAX_BUILTIN_FAMIN4H,
+  AARCH64_FAMINMAX_BUILTIN_FAMIN8H,
+  AARCH64_FAMINMAX_BUILTIN_FAMIN2S,
+  AARCH64_FAMINMAX_BUILTIN_FAMIN4S,
+  AARCH64_FAMINMAX_BUILTIN_FAMIN2D,
   /* System register builtins.  */
   AARCH64_RSR,
   AARCH64_RSRP,
@@ -1547,6 +1558,66 @@  aarch64_init_simd_builtin_functions (bool called_from_pragma)
     }
 }
 
+/* Initialize the absolute maximum/minimum (FAMINMAX) builtins.  */
+
+typedef struct
+{
+  const char *name;
+  unsigned int code;
+  tree eltype;
+  machine_mode mode;
+} faminmax_builtins_data;
+
+static void
+aarch64_init_faminmax_builtins ()
+{
+  faminmax_builtins_data data[] = {
+    /* Absolute maximum.  */
+    {"vamax_f16", AARCH64_FAMINMAX_BUILTIN_FAMAX4H,
+     aarch64_simd_types[Float16x4_t].eltype,
+     aarch64_simd_types[Float16x4_t].mode},
+    {"vamaxq_f16", AARCH64_FAMINMAX_BUILTIN_FAMAX8H,
+     aarch64_simd_types[Float16x8_t].eltype,
+     aarch64_simd_types[Float16x8_t].mode},
+    {"vamax_f32", AARCH64_FAMINMAX_BUILTIN_FAMAX2S,
+     aarch64_simd_types[Float32x2_t].eltype,
+     aarch64_simd_types[Float32x2_t].mode},
+    {"vamaxq_f32", AARCH64_FAMINMAX_BUILTIN_FAMAX4S,
+     aarch64_simd_types[Float32x4_t].eltype,
+     aarch64_simd_types[Float32x4_t].mode},
+    {"vamaxq_f64", AARCH64_FAMINMAX_BUILTIN_FAMAX2D,
+     aarch64_simd_types[Float64x2_t].eltype,
+     aarch64_simd_types[Float64x2_t].mode},
+    /* Absolute minimum.  */
+    {"vamin_f16", AARCH64_FAMINMAX_BUILTIN_FAMIN4H,
+     aarch64_simd_types[Float16x4_t].eltype,
+     aarch64_simd_types[Float16x4_t].mode},
+    {"vaminq_f16", AARCH64_FAMINMAX_BUILTIN_FAMIN8H,
+     aarch64_simd_types[Float16x8_t].eltype,
+     aarch64_simd_types[Float16x8_t].mode},
+    {"vamin_f32", AARCH64_FAMINMAX_BUILTIN_FAMIN2S,
+     aarch64_simd_types[Float32x2_t].eltype,
+     aarch64_simd_types[Float32x2_t].mode},
+    {"vaminq_f32", AARCH64_FAMINMAX_BUILTIN_FAMIN4S,
+     aarch64_simd_types[Float32x4_t].eltype,
+     aarch64_simd_types[Float32x4_t].mode},
+    {"vaminq_f64", AARCH64_FAMINMAX_BUILTIN_FAMIN2D,
+     aarch64_simd_types[Float64x2_t].eltype,
+     aarch64_simd_types[Float64x2_t].mode},
+  };
+
+  for (size_t i = 0; i < ARRAY_SIZE (data); ++i)
+    {
+      tree type
+	= build_vector_type (data[i].eltype, GET_MODE_NUNITS (data[i].mode));
+      tree fntype = build_function_type_list (type, type, type, NULL_TREE);
+      unsigned int code = data[i].code;
+      const char *name = data[i].name;
+      aarch64_builtin_decls[code]
+	= aarch64_general_simulate_builtin (name, fntype, code);
+    }
+}
+
 /* Register the tuple type that contains NUM_VECTORS of the AdvSIMD type
    indexed by TYPE_INDEX.  */
 static void
@@ -1640,6 +1711,7 @@  handle_arm_neon_h (void)
 
   aarch64_init_simd_builtin_functions (true);
   aarch64_init_simd_intrinsics ();
+  aarch64_init_faminmax_builtins ();
 }
 
 static void
@@ -2197,15 +2269,34 @@  aarch64_general_check_builtin_call (location_t location, vec<location_t>,
     case AARCH64_WSR64:
     case AARCH64_WSRF:
     case AARCH64_WSRF64:
-      tree addr = STRIP_NOPS (args[0]);
-      if (TREE_CODE (TREE_TYPE (addr)) != POINTER_TYPE
-	  || TREE_CODE (addr) != ADDR_EXPR
-	  || TREE_CODE (TREE_OPERAND (addr, 0)) != STRING_CST)
-	{
-	  error_at (location, "first argument to %qD must be a string literal",
-		    fndecl);
-	  return false;
-	}
+      {
+	tree addr = STRIP_NOPS (args[0]);
+	if (TREE_CODE (TREE_TYPE (addr)) != POINTER_TYPE
+	    || TREE_CODE (addr) != ADDR_EXPR
+	    || TREE_CODE (TREE_OPERAND (addr, 0)) != STRING_CST)
+	  {
+	    error_at (location,
+		      "first argument to %qD must be a string literal",
+		      fndecl);
+	    return false;
+	  }
+      }
+    case AARCH64_FAMINMAX_BUILTIN_FAMAX8H:
+    case AARCH64_FAMINMAX_BUILTIN_FAMAX2S:
+    case AARCH64_FAMINMAX_BUILTIN_FAMAX4S:
+    case AARCH64_FAMINMAX_BUILTIN_FAMAX2D:
+    case AARCH64_FAMINMAX_BUILTIN_FAMIN4H:
+    case AARCH64_FAMINMAX_BUILTIN_FAMIN8H:
+    case AARCH64_FAMINMAX_BUILTIN_FAMIN2S:
+    case AARCH64_FAMINMAX_BUILTIN_FAMIN4S:
+    case AARCH64_FAMINMAX_BUILTIN_FAMIN2D:
+      {
+	if (!TARGET_FAMINMAX)
+	  {
+	    error_at (location, "need +faminmax flag to call %qD", fndecl);
+	    return false;
+	  }
+      }
     }
   /* Default behavior.  */
   return true;
@@ -3071,6 +3162,44 @@  aarch64_expand_builtin_data_intrinsic (unsigned int fcode, tree exp, rtx target)
   return ops[0].value;
 }
 
+static rtx
+aarch64_expand_builtin_faminmax (unsigned int fcode, tree exp, rtx target)
+{
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  rtx op0 = force_reg (mode, expand_normal (CALL_EXPR_ARG (exp, 0)));
+  rtx op1 = force_reg (mode, expand_normal (CALL_EXPR_ARG (exp, 1)));
+
+  enum insn_code icode;
+  if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMAX4H)
+    icode = CODE_FOR_aarch64_famaxv4hf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMAX8H)
+    icode = CODE_FOR_aarch64_famaxv8hf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMAX2S)
+    icode = CODE_FOR_aarch64_famaxv2sf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMAX4S)
+    icode = CODE_FOR_aarch64_famaxv4sf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMAX2D)
+    icode = CODE_FOR_aarch64_famaxv2df;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMIN4H)
+    icode = CODE_FOR_aarch64_faminv4hf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMIN8H)
+    icode = CODE_FOR_aarch64_faminv8hf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMIN2S)
+    icode = CODE_FOR_aarch64_faminv2sf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMIN4S)
+    icode = CODE_FOR_aarch64_faminv4sf;
+  else if (fcode == AARCH64_FAMINMAX_BUILTIN_FAMIN2D)
+    icode = CODE_FOR_aarch64_faminv2df;
+  else
+    gcc_unreachable ();
+
+  rtx pat = GEN_FCN (icode) (target, op0, op1);
+
+  emit_insn (pat);
+
+  return target;
+}
+
 /* Expand an expression EXP as fpsr or fpcr setter (depending on
    UNSPEC) using MODE.  */
 static void
@@ -3250,6 +3379,9 @@  aarch64_general_expand_builtin (unsigned int fcode, tree exp, rtx target,
   if (fcode >= AARCH64_REV16
       && fcode <= AARCH64_RBITLL)
     return aarch64_expand_builtin_data_intrinsic (fcode, exp, target);
+  if (fcode >= AARCH64_FAMINMAX_BUILTIN_FAMAX4H
+      && fcode <= AARCH64_FAMINMAX_BUILTIN_FAMIN2D)
+    return aarch64_expand_builtin_faminmax (fcode, exp, target);
 
   gcc_unreachable ();
 }
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 42ec0eec31e..e95bd70893a 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -232,6 +232,8 @@  AARCH64_OPT_EXTENSION("the", THE, (), (), (), "the")
 
 AARCH64_OPT_EXTENSION("gcs", GCS, (), (), (), "gcs")
 
+AARCH64_OPT_EXTENSION("faminmax", FAMINMAX, (SIMD), (), (), "faminmax")
+
 #undef AARCH64_OPT_FMV_EXTENSION
 #undef AARCH64_OPT_EXTENSION
 #undef AARCH64_FMV_FEATURE
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bbeee221f37..c769a482312 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -9881,3 +9881,14 @@ 
   "shl\\t%d0, %d1, #16"
   [(set_attr "type" "neon_shift_imm")]
 )
+
+;; faminmax instruction patterns
+(define_insn "aarch64_<faminmax><mode>"
+  [(set (match_operand:VHSDF 0 "register_operand" "=w")
+	(unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")
+		       (match_operand:VHSDF 2 "register_operand" "w")]
+		      FAMINMAX_UNS))]
+  "TARGET_FAMINMAX"
+  "<faminmax>\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+  [(set_attr "type" "neon_fp_aminmax<q>")]
+)
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 8056c337957..c6773f64745 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -456,6 +456,10 @@  constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
     enabled through +gcs.  */
 #define TARGET_GCS AARCH64_HAVE_ISA (GCS)
 
+/*  Floating Point Absolute Maximum/Minimum extension instructions are
+    enabled through +faminmax.  */
+#define TARGET_FAMINMAX AARCH64_HAVE_ISA (FAMINMAX)
+
 /* Prefer different predicate registers for the output of a predicated
    operation over re-using an existing input predicate.  */
 #define TARGET_SVE_PRED_CLOBBER (TARGET_SVE \
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 95fe8f070f4..297e1b8e9d9 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1111,6 +1111,10 @@ 
     UNSPEC_SME_WRITE
     UNSPEC_SME_WRITE_HOR
     UNSPEC_SME_WRITE_VER
+
+    ;; Used in faminmax patterns
+    UNSPEC_FAMAX
+    UNSPEC_FAMIN
 ])
 
 ;; ------------------------------------------------------------------
@@ -4457,3 +4461,9 @@ 
    (UNSPECV_SET_FPCR "fpcr")])
 
 (define_int_attr bits_etype [(8 "b") (16 "h") (32 "s") (64 "d")])
+
+;; Iterators and attributes for faminmax
+
+(define_int_iterator FAMINMAX_UNS [UNSPEC_FAMAX UNSPEC_FAMIN])
+
+(define_int_attr faminmax [(UNSPEC_FAMAX "famax") (UNSPEC_FAMIN "famin")])
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 9527bdb9e87..d8de9dbc9d1 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -492,6 +492,8 @@ 
 ; neon_fp_reduc_minmax_s_q
 ; neon_fp_reduc_minmax_d
 ; neon_fp_reduc_minmax_d_q
+; neon_fp_aminmax
+; neon_fp_aminmax_q
 ; neon_fp_cvt_narrow_s_q
 ; neon_fp_cvt_narrow_d_q
 ; neon_fp_cvt_widen_h
@@ -1044,6 +1046,8 @@ 
   neon_fp_reduc_minmax_d,\
   neon_fp_reduc_minmax_d_q,\
 \
+  neon_fp_aminmax,\
+  neon_fp_aminmax_q,\
   neon_fp_cvt_narrow_s_q,\
   neon_fp_cvt_narrow_d_q,\
   neon_fp_cvt_widen_h,\
@@ -1264,6 +1268,8 @@ 
           neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,
           neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\
           neon_fp_reduc_minmax_d_q,\
+	  neon_fp_aminmax, neon_fp_aminmax_q,\
+          neon_fp_aminmax, neon_fp_aminmax_q,\
           neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\
           neon_fp_cvt_widen_h, neon_fp_cvt_widen_s, neon_fp_to_int_s,\
           neon_fp_to_int_s_q, neon_int_to_fp_s, neon_int_to_fp_s_q,\
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4850c7379bf..d48516f4f60 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21777,6 +21777,8 @@  Enable support for Armv9.4-a Guarded Control Stack extension.
 Enable support for Armv8.9-a/9.4-a translation hardening extension.
 @item rcpc3
 Enable the RCpc3 (Release Consistency) extension.
+@item faminmax
+Enable the Floating Point Absolute Maximum/Minimum extension.
 
 @end table
 
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/faminmax.c b/gcc/testsuite/gcc.target/aarch64/simd/faminmax.c
new file mode 100644
index 00000000000..52eafce1b5e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/faminmax.c
@@ -0,0 +1,40 @@ 
+/* { dg-do assemble } */
+/* { dg-additional-options "-march=armv9-a+faminmax" } */
+
+#include "arm_neon.h"
+
+float16x4_t
+test_vamax_f16 (float16x4_t a, float16x4_t b)
+{
+  return vamax_f16 (a, b);
+}
+
+float16x8_t
+test_vamaxq_f16 (float16x8_t a, float16x8_t b)
+{
+  return vamaxq_f16 (a, b);
+}
+
+float32x2_t
+test_vamax_f32 (float32x2_t a, float32x2_t b)
+{
+  return vamax_f32 (a, b);
+}
+
+float32x4_t
+test_vamaxq_f32 (float32x4_t a, float32x4_t b)
+{
+  return vamaxq_f32 (a, b);
+}
+
+float64x2_t
+test_vamaxq_f64 (float64x2_t a, float64x2_t b)
+{
+  return vamaxq_f64 (a, b);
+}
+
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.4h, v[0-9]+.4h, v[0-9]+.4h} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.8h, v[0-9]+.8h, v[0-9]+.8h} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.2s, v[0-9]+.2s, v[0-9]+.2s} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.4s, v[0-9]+.4s, v[0-9]+.4s} 1 } } */
+/* { dg-final { scan-assembler-times {\tfamax\tv[0-9]+.2d, v[0-9]+.2d, v[0-9]+.2d} 1 } } */
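
For readers who want to see how user code might call one of the new intrinsics, here is a hedged sketch. It assumes a feature-test macro named `__ARM_FEATURE_FAMINMAX`; this patch does not define such a macro, so treat the guard as an assumption about the toolchain. The function name `absmax_f32` is hypothetical. When the macro is absent the code falls back to a scalar loop, and the scalar path does not attempt to reproduce the instruction's NaN behaviour.

```c
#include <math.h>
#include <stddef.h>

#if defined (__ARM_FEATURE_FAMINMAX)   /* Assumed feature-test macro.  */
#include <arm_neon.h>
#endif

/* out[i] = max (|a[i]|, |b[i]|) for N floats.  */
static void
absmax_f32 (float *out, const float *a, const float *b, size_t n)
{
  size_t i = 0;
#if defined (__ARM_FEATURE_FAMINMAX)
  /* Four lanes at a time using the intrinsic added by this patch.  */
  for (; i + 4 <= n; i += 4)
    vst1q_f32 (out + i, vamaxq_f32 (vld1q_f32 (a + i), vld1q_f32 (b + i)));
#endif
  /* Scalar tail, and the only path when the extension is unavailable.  */
  for (; i < n; ++i)
    {
      float x = fabsf (a[i]), y = fabsf (b[i]);
      out[i] = x > y ? x : y;
    }
}
```

Keeping the scalar loop as both tail handler and fallback means the function builds unchanged on targets without `+faminmax`.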