From patchwork Fri Oct 18 15:01:21 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mariam Arutunian
X-Patchwork-Id: 1999253
From: Mariam Arutunian
Date: Fri, 18 Oct 2024 19:01:21 +0400
Subject: [RFC/RFA][PATCH v5 06/12] aarch64: Implement new expander for
 efficient CRC computation.
To: GCC Patches, Jeff Law
List-Id: Gcc-patches mailing list

This patch introduces two new expanders for the aarch64 backend,
dedicated to generating optimized code for CRC computations.
The new expanders use the crc32, crc32c and pmull instructions to
compute CRCs faster when the target architecture supports them.

Expander 1: Bit-Forward CRC (crc4)

For targets that support the pmull instruction (TARGET_AES), the
expander generates code that uses the pmull (crypto_pmulldi)
instruction for CRC computation.
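As an illustration of the pmull-based bit-forward computation described
above, here is a minimal C sketch, assuming a 32-bit CRC and 8-bit data.
It follows the same steps as aarch64_expand_crc_using_pmull in the diff
(two carry-less multiplies with the Barrett quotient and the polynomial),
but clmul_lo, barrett_quotient, crc32_fwd_pmull, crc32_fwd_bitwise and
count_mismatches are hypothetical names for this example only, and the
software clmul_lo merely stands in for the hardware instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Software stand-in for PMULL: carry-less (GF(2)) multiply, keeping only
   the low 64 bits of the product, which is all this sketch needs.  */
static uint64_t
clmul_lo (uint64_t a, uint64_t b)
{
  uint64_t r = 0;
  for (int i = 0; i < 64; i++)
    if ((b >> i) & 1)
      r ^= a << i;
  return r;
}

/* floor (x^(2n) / P) over GF(2), where POLY is P without its leading 1.  */
static uint64_t
barrett_quotient (uint64_t poly, unsigned n)
{
  uint64_t q = 1, rem = poly;
  for (unsigned i = 0; i < n; i++)
    {
      uint64_t coeff = (rem >> (n - 1)) & 1;
      q = (q << 1) | coeff;
      rem = (rem << 1) ^ (coeff ? poly : 0);
    }
  return q;
}

/* One bit-forward CRC-32 update for a single data byte, using two
   carry-less multiplies instead of a bit loop.  */
static uint32_t
crc32_fwd_pmull (uint32_t crc, uint8_t data, uint32_t poly)
{
  const unsigned n = 32, d = 8;
  uint64_t q = barrett_quotient (poly, n);
  uint64_t a = (uint64_t) (crc >> (n - d)) ^ data; /* top d CRC bits ^ data */
  a = clmul_lo (a, q) >> n;      /* quotient of (a * x^n) / P */
  a = clmul_lo (a, poly);        /* low n bits hold (a * x^n) mod P */
  return (uint32_t) (a ^ ((uint64_t) crc << d));
}

/* Bit-at-a-time reference for the same update.  */
static uint32_t
crc32_fwd_bitwise (uint32_t crc, uint8_t data, uint32_t poly)
{
  crc ^= (uint32_t) data << 24;
  for (int i = 0; i < 8; i++)
    crc = (crc & 0x80000000u) ? (crc << 1) ^ poly : crc << 1;
  return crc;
}

/* Chain the CRC over all 256 byte values and count disagreements between
   the two implementations.  */
int
count_mismatches (uint32_t poly)
{
  int bad = 0;
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned d = 0; d < 256; d++)
    {
      uint32_t ref = crc32_fwd_bitwise (crc, (uint8_t) d, poly);
      bad += crc32_fwd_pmull (crc, (uint8_t) d, poly) != ref;
      crc = ref;
    }
  return bad;
}
```

Calling count_mismatches with a polynomial such as 0x04C11DB7 should
return 0 if the two-multiply form matches the bit loop. In the real
expander the quotient is a compile-time constant, so only the two
multiplies and the shifts remain at run time.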
Expander 2: Bit-Reversed CRC (crc_rev4)

The expander first checks whether the target supports the CRC32*
instruction set (TARGET_CRC32) and whether the polynomial in use is
0x1EDC6F41 (iSCSI) or 0x04C11DB7 (HDLC). If both conditions are met, it
emits the corresponding crc32* instruction (chosen by the data size and
the polynomial). Otherwise, if the target supports pmull (TARGET_AES),
it uses the pmull (crypto_pmulldi) instruction for bit-reversed CRC
computation. Failing that, table-based CRC is generated.

  gcc/config/aarch64/

	* aarch64-protos.h (aarch64_expand_crc_using_pmull): New extern
	function declaration.
	(aarch64_expand_reversed_crc_using_pmull): Likewise.
	* aarch64.cc (aarch64_expand_crc_using_pmull): New function.
	(aarch64_expand_reversed_crc_using_pmull): Likewise.
	* aarch64.md (crc_rev4): New expander for reversed CRC.
	(crc4): New expander for bit-forward CRC.
	* iterators.md (crc_data_type): New mode attribute.

  gcc/testsuite/gcc.target/aarch64/

	* crc-1-pmul.c: New test.
	* crc-10-pmul.c: Likewise.
	* crc-12-pmul.c: Likewise.
	* crc-13-pmul.c: Likewise.
	* crc-14-pmul.c: Likewise.
	* crc-17-pmul.c: Likewise.
	* crc-18-pmul.c: Likewise.
	* crc-21-pmul.c: Likewise.
	* crc-22-pmul.c: Likewise.
	* crc-23-pmul.c: Likewise.
	* crc-4-pmul.c: Likewise.
	* crc-5-pmul.c: Likewise.
	* crc-6-pmul.c: Likewise.
	* crc-7-pmul.c: Likewise.
	* crc-8-pmul.c: Likewise.
	* crc-9-pmul.c: Likewise.
	* crc-CCIT-data16-pmul.c: Likewise.
	* crc-CCIT-data8-pmul.c: Likewise.
	* crc-coremark-16bitdata-pmul.c: Likewise.
	* crc-crc32-data16.c: Likewise.
	* crc-crc32-data32.c: Likewise.
	* crc-crc32-data8.c: Likewise.
	* crc-crc32c-data16.c: Likewise.
	* crc-crc32c-data32.c: Likewise.
	* crc-crc32c-data8.c: Likewise.
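The selection order described for Expander 2 can be sketched in plain C
as below. This is only an illustration: reflect_bits,
choose_reversed_crc_strategy and the crc_strategy enum are hypothetical
names, the has_crc32 and has_pmull flags stand in for TARGET_CRC32 and
TARGET_AES, and the data_bits <= crc_bits condition is an assumption
taken from the gcc_assert in the pmull helper. The sketch also shows why
the tests spell the polynomial as 0xEDB88320: that is 0x04C11DB7 with its
32 bits reflected.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the low BITS bits of V (BITS <= 32).  */
uint32_t
reflect_bits (uint32_t v, unsigned bits)
{
  uint32_t r = 0;
  for (unsigned i = 0; i < bits; i++)
    if ((v >> i) & 1)
      r |= UINT32_C (1) << (bits - 1 - i);
  return r;
}

enum crc_strategy
{
  USE_CRC32C_INSN,	/* crc32c* instruction (iSCSI polynomial).  */
  USE_CRC32_INSN,	/* crc32* instruction (HDLC polynomial).  */
  USE_PMULL,		/* Two carry-less multiplies.  */
  USE_TABLE		/* Generic table-based fallback.  */
};

/* Pick a strategy for a bit-reversed CRC.  POLY is the polynomial in
   bit-forward notation, without the leading 1.  */
enum crc_strategy
choose_reversed_crc_strategy (int has_crc32, int has_pmull, uint32_t poly,
			      unsigned crc_bits, unsigned data_bits)
{
  if (has_crc32 && poly == 0x1EDC6F41u && crc_bits == 32)
    return USE_CRC32C_INSN;
  if (has_crc32 && poly == 0x04C11DB7u && crc_bits == 32)
    return USE_CRC32_INSN;
  if (has_pmull && data_bits <= crc_bits)
    return USE_PMULL;
  return USE_TABLE;
}
```

For example, a 16-bit CRC with polynomial 0x1021 never matches the crc32*
checks, so it would take the pmull path when available and the table
path otherwise.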
Signed-off-by: Mariam Arutunian
Co-authored-by: Richard Sandiford
---
 gcc/config/aarch64/aarch64-protos.h | 3 +
 gcc/config/aarch64/aarch64.cc | 131 ++++++++++++++++++
 gcc/config/aarch64/aarch64.md | 57 ++++++++
 gcc/config/aarch64/iterators.md | 4 +
 gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-10-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-12-pmul.c | 9 ++
 .../gcc.target/aarch64/crc-13-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-14-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-17-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-18-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-21-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-22-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-23-pmul.c | 8 ++
 gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c | 8 ++
 gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c | 8 ++
 gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c | 8 ++
 gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c | 8 ++
 gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c | 8 ++
 gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c | 8 ++
 .../gcc.target/aarch64/crc-CCIT-data16-pmul.c | 9 ++
 .../gcc.target/aarch64/crc-CCIT-data8-pmul.c | 9 ++
 .../aarch64/crc-coremark-16bitdata-pmul.c | 9 ++
 .../gcc.target/aarch64/crc-crc32-data16.c | 53 +++++++
 .../gcc.target/aarch64/crc-crc32-data32.c | 52 +++++++
 .../gcc.target/aarch64/crc-crc32-data8.c | 53 +++++++
 .../gcc.target/aarch64/crc-crc32c-data16.c | 53 +++++++
 .../gcc.target/aarch64/crc-crc32c-data32.c | 52 +++++++
 .../gcc.target/aarch64/crc-crc32c-data8.c | 53 +++++++
 29 files changed, 667 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index d03c1fe798b..7c157073cc6 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1124,5 +1124,8 @@ extern void aarch64_adjust_reg_alloc_order ();
 bool aarch64_optimize_mode_switching (aarch64_mode_entity);
 void aarch64_restore_za (rtx);
+void aarch64_expand_crc_using_pmull (scalar_mode, scalar_mode, rtx *);
+void aarch64_expand_reversed_crc_using_pmull (scalar_mode, scalar_mode, rtx *);
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 6a3f1a23a9f..1cc549c5023 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -30386,6 +30386,137 @@ aarch64_retrieve_sysreg (const char *regname, bool write_p, bool is128op)
   return sysreg->encoding;
 }
+/* Generate assembly to calculate CRC
+   using carry-less multiplication instruction.
+   OPERANDS[1] is input CRC,
+   OPERANDS[2] is data (message),
+   OPERANDS[3] is the polynomial without the leading 1.  */
+
+void
+aarch64_expand_crc_using_pmull (scalar_mode crc_mode,
+                                scalar_mode data_mode,
+                                rtx *operands)
+{
+  /* Check and keep arguments.  */
+  gcc_assert (!CONST_INT_P (operands[0]));
+  gcc_assert (CONST_INT_P (operands[3]));
+  rtx crc = operands[1];
+  rtx data = operands[2];
+  rtx polynomial = operands[3];
+
+  unsigned HOST_WIDE_INT crc_size = GET_MODE_BITSIZE (crc_mode);
+  unsigned HOST_WIDE_INT data_size = GET_MODE_BITSIZE (data_mode);
+  gcc_assert (crc_size <= 32);
+  gcc_assert (data_size <= crc_size);
+
+  /* Calculate the quotient.  */
+  unsigned HOST_WIDE_INT
+      q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size);
+  /* CRC calculation's main part.  */
+  if (crc_size > data_size)
+    crc = expand_shift (RSHIFT_EXPR, DImode, crc, crc_size - data_size,
+                        NULL_RTX, 1);
+
+  rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
+  polynomial = simplify_gen_unary (ZERO_EXTEND, DImode, polynomial,
+                                   GET_MODE (polynomial));
+  rtx t1 = force_reg (DImode, polynomial);
+
+  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
+                         OPTAB_WIDEN);
+
+  rtx pmull_res = gen_reg_rtx (TImode);
+  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t0));
+  a0 = gen_lowpart (DImode, pmull_res);
+
+  a0 = expand_shift (RSHIFT_EXPR, DImode, a0, crc_size, NULL_RTX, 1);
+
+  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t1));
+  a0 = gen_lowpart (DImode, pmull_res);
+
+  if (crc_size > data_size)
+    {
+      rtx crc_part = expand_shift (LSHIFT_EXPR, DImode, operands[1], data_size,
+                                   NULL_RTX, 0);
+      a0 = expand_binop (DImode, xor_optab, a0, crc_part, NULL_RTX, 1,
+                         OPTAB_DIRECT);
+    }
+
+  aarch64_emit_move (operands[0], gen_lowpart (crc_mode, a0));
+}
+
+/* Generate assembly to calculate reversed CRC
+   using carry-less multiplication instruction.
+   OPERANDS[1] is input CRC,
+   OPERANDS[2] is data,
+   OPERANDS[3] is the polynomial without the leading 1.  */
+
+void
+aarch64_expand_reversed_crc_using_pmull (scalar_mode crc_mode,
+                                         scalar_mode data_mode,
+                                         rtx *operands)
+{
+  /* Check and keep arguments.  */
+  gcc_assert (!CONST_INT_P (operands[0]));
+  gcc_assert (CONST_INT_P (operands[3]));
+  rtx crc = operands[1];
+  rtx data = operands[2];
+  rtx polynomial = operands[3];
+
+  unsigned HOST_WIDE_INT crc_size = GET_MODE_BITSIZE (crc_mode);
+  unsigned HOST_WIDE_INT data_size = GET_MODE_BITSIZE (data_mode);
+  gcc_assert (crc_size <= 32);
+  gcc_assert (data_size <= crc_size);
+
+  /* Calculate the quotient.  */
+  unsigned HOST_WIDE_INT
+      q = gf2n_poly_long_div_quotient (UINTVAL (polynomial), crc_size);
+  /* Reflect the calculated quotient.  */
+  q = reflect_hwi (q, crc_size + 1);
+  rtx t0 = force_reg (DImode, gen_int_mode (q, DImode));
+
+  /* Reflect the polynomial.  */
+  unsigned HOST_WIDE_INT ref_polynomial = reflect_hwi (UINTVAL (polynomial),
+                                                       crc_size);
+  /* An unshifted multiplier would require the final result to be extracted
+     using a shift right by DATA_SIZE - 1 bits.  Shift the multiplier left
+     so that the shift right can be by CRC_SIZE bits instead.  */
+  ref_polynomial <<= crc_size - data_size + 1;
+  rtx t1 = force_reg (DImode, gen_int_mode (ref_polynomial, DImode));
+
+  /* CRC calculation's main part.  */
+  rtx a0 = expand_binop (DImode, xor_optab, crc, data, NULL_RTX, 1,
+                         OPTAB_WIDEN);
+
+  /* Perform carry-less multiplication and get low part.  */
+  rtx pmull_res = gen_reg_rtx (TImode);
+  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t0));
+  a0 = gen_lowpart (DImode, pmull_res);
+
+  a0 = expand_binop (DImode, and_optab, a0,
+                     gen_int_mode (GET_MODE_MASK (data_mode), DImode),
+                     NULL_RTX, 1, OPTAB_WIDEN);
+
+  /* Perform carry-less multiplication.  */
+  emit_insn (gen_aarch64_crypto_pmulldi (pmull_res, a0, t1));
+
+  /* Perform a shift right by CRC_SIZE as an extraction of lane 1.  */
+  machine_mode crc_vmode = aarch64_vq_mode (crc_mode).require ();
+  a0 = (crc_size > data_size ? gen_reg_rtx (crc_mode) : operands[0]);
+  emit_insn (gen_aarch64_get_lane (crc_vmode, a0,
+                                   gen_lowpart (crc_vmode, pmull_res),
+                                   aarch64_endian_lane_rtx (crc_vmode, 1)));
+
+  if (crc_size > data_size)
+    {
+      rtx crc_part = expand_shift (RSHIFT_EXPR, crc_mode, crc, data_size,
+                                   NULL_RTX, 1);
+      a0 = expand_binop (crc_mode, xor_optab, a0, crc_part, operands[0], 1,
+                         OPTAB_WIDEN);
+      aarch64_emit_move (operands[0], a0);
+    }
+}
+
 /* Target-specific selftests.  */
 #if CHECKING_P
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c54b29cd64b..d390d45f77f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4566,6 +4566,63 @@
   [(set_attr "type" "crc")]
 )
+;; Reversed CRC
+(define_expand "crc_rev4"
+  [;; return value (calculated CRC)
+   (match_operand:ALLX 0 "register_operand" "=r")
+   ;; initial CRC
+   (match_operand:ALLX 1 "register_operand" "r")
+   ;; data
+   (match_operand:ALLI 2 "register_operand" "r")
+   ;; polynomial without leading 1
+   (match_operand:ALLX 3)]
+  ""
+  {
+    /* If the polynomial is the same as the polynomial of crc32c* instruction,
+       put that instruction.  crc32c uses iSCSI polynomial.  */
+    if (TARGET_CRC32 && INTVAL (operands[3]) == 0x1EDC6F41
+        && mode == SImode)
+      emit_insn (gen_aarch64_crc32c (operands[0],
+                                     operands[1],
+                                     operands[2]));
+    /* If the polynomial is the same as the polynomial of crc32* instruction,
+       put that instruction.  crc32 uses HDLC etc. polynomial.  */
+    else if (TARGET_CRC32 && INTVAL (operands[3]) == 0x04C11DB7
+             && mode == SImode)
+      emit_insn (gen_aarch64_crc32 (operands[0],
+                                    operands[1],
+                                    operands[2]));
+    else if (TARGET_AES && <= )
+      aarch64_expand_reversed_crc_using_pmull (mode,
+                                               mode,
+                                               operands);
+    else
+      /* Otherwise, generate table-based CRC.  */
+      expand_reversed_crc_table_based (operands[0], operands[1], operands[2],
+                                       operands[3], mode,
+                                       generate_reflecting_code_standard);
+    DONE;
+  }
+)
+
+;; Bit-forward CRC
+(define_expand "crc4"
+  [;; return value (calculated CRC)
+   (match_operand:ALLX 0 "register_operand" "=r")
+   ;; initial CRC
+   (match_operand:ALLX 1 "register_operand" "r")
+   ;; data
+   (match_operand:ALLI 2 "register_operand" "r")
+   ;; polynomial without leading 1
+   (match_operand:ALLX 3)]
+  "TARGET_AES && <= "
+  {
+    aarch64_expand_crc_using_pmull (mode, mode,
+                                    operands);
+    DONE;
+  }
+)
+
 (define_insn "*csinc2_insn"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (plus:GPI (match_operand 2 "aarch64_comparison_operation" "")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 20a318e023b..9c439c45dd3 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1280,6 +1280,10 @@
 ;; Map a mode to a specific constraint character.
 (define_mode_attr cmode [(QI "q") (HI "h") (SI "s") (DI "d")])
+;; Map a mode to a specific constraint character for calling
+;; appropriate version of crc.
+(define_mode_attr crc_data_type [(QI "b") (HI "h") (SI "w") (DI "x")])
+
 ;; Map modes to Usg and Usj constraints for SISD right shifts
 (define_mode_attr cmode_simd [(SI "g") (DI "j")])
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
new file mode 100644
index 00000000000..4043251dbd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-1-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
+
+#include "../../gcc.dg/torture/crc-1.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code."
0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
new file mode 100644
index 00000000000..0078eebe35c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-10-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-10.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
new file mode 100644
index 00000000000..16d901eeaef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-12-pmul.c
@@ -0,0 +1,9 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc -fdisable-tree-phiopt2 -fdisable-tree-phiopt3" } */
+/* { dg-skip-if "" { *-*-* } { "-flto"} } */
+
+#include "../../gcc.dg/torture/crc-12.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
new file mode 100644
index 00000000000..bd8f32e6924
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-13-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-13.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
new file mode 100644
index 00000000000..d35c1110c89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-14-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-14.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
new file mode 100644
index 00000000000..99b84c8dde0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-17-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-17.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
new file mode 100644
index 00000000000..888c99a7dd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-18-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-18.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
new file mode 100644
index 00000000000..4b92deceaac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-21-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-21.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
new file mode 100644
index 00000000000..b42b8525b24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-22-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-22.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
new file mode 100644
index 00000000000..eb2efae0c41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-23-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-23.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
new file mode 100644
index 00000000000..c7d50017fe8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-4-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-4.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
new file mode 100644
index 00000000000..2a4b87cc5d6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-5-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -w -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-5.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
new file mode 100644
index 00000000000..84604af525a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-6-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-6.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
new file mode 100644
index 00000000000..e1263fca91d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-7-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-7.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
new file mode 100644
index 00000000000..141b474578b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-8-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-8.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
new file mode 100644
index 00000000000..2fdcd425a3b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-9-pmul.c
@@ -0,0 +1,8 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+
+#include "../../gcc.dg/torture/crc-9.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
new file mode 100644
index 00000000000..21520474564
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data16-pmul.c
@@ -0,0 +1,9 @@
+/* { dg-do run } */
+/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+/* { dg-skip-if "" { *-*-* } { "-flto"} } */
+
+#include "../../gcc.dg/torture/crc-CCIT-data16.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
new file mode 100644
index 00000000000..3dcc92320f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-CCIT-data8-pmul.c
@@ -0,0 +1,9 @@
+/* { dg-do run } */
+/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+/* { dg-skip-if "" { *-*-* } { "-flto" } } */
+
+#include "../../gcc.dg/torture/crc-CCIT-data8.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code."
0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
new file mode 100644
index 00000000000..e5196aaafef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-coremark-16bitdata-pmul.c
@@ -0,0 +1,9 @@
+/* { dg-do run } */
+/* { dg-options "-w -march=armv8-a+crypto -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+/* { dg-skip-if "" { *-*-* } { "-flto"} } */
+
+#include "../../gcc.dg/torture/crc-coremark16-data16.c"
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "pmull" "dfinish"} } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
new file mode 100644
index 00000000000..e82cb04fcc3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data16.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+/* { dg-skip-if "" { *-*-* } { "-flto"} } */
+
+#include <stdint.h>
+#include <stdlib.h>
+
+__attribute__ ((noinline,optimize(0)))
+uint32_t _crc32_O0 (uint32_t crc, uint16_t data) {
+  int i;
+  crc = crc ^ data;
+
+  for (i = 0; i < 8; i++) {
+      if (crc & 1)
+        crc = (crc >> 1) ^ 0xEDB88320;
+      else
+        crc = (crc >> 1);
+  }
+
+  return crc;
+}
+
+uint32_t _crc32 (uint32_t crc, uint16_t data) {
+  int i;
+  crc = crc ^ data;
+
+  for (i = 0; i < 8; i++) {
+      if (crc & 1)
+        crc = (crc >> 1) ^ 0xEDB88320;
+      else
+        crc = (crc >> 1);
+  }
+
+  return crc;
+}
+
+int main ()
+{
+  uint32_t crc = 0x0D800D80;
+  for (uint16_t i = 0; i < 0xffff; i++)
+    {
+      uint32_t res1 = _crc32_O0 (crc, i);
+      uint32_t res2 = _crc32 (crc, i);
+      if (res1 != res2)
+        abort ();
+      crc = res1;
+    }
+}
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
+/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
new file mode 100644
index 00000000000..a7564a7e28a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data32.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+/* { dg-skip-if "" { *-*-* } { "-flto"} } */
+
+#include <stdint.h>
+#include <stdlib.h>
+__attribute__ ((noinline,optimize(0)))
+uint32_t _crc32_O0 (uint32_t crc, uint32_t data) {
+  int i;
+  crc = crc ^ data;
+
+  for (i = 0; i < 32; i++) {
+      if (crc & 1)
+        crc = (crc >> 1) ^ 0xEDB88320;
+      else
+        crc = (crc >> 1);
+  }
+
+  return crc;
+}
+
+uint32_t _crc32 (uint32_t crc, uint32_t data) {
+  int i;
+  crc = crc ^ data;
+
+  for (i = 0; i < 32; i++) {
+      if (crc & 1)
+        crc = (crc >> 1) ^ 0xEDB88320;
+      else
+        crc = (crc >> 1);
+  }
+
+  return crc;
+}
+
+int main ()
+{
+  uint32_t crc = 0x0D800D80;
+  for (uint8_t i = 0; i < 0xff; i++)
+    {
+      uint32_t res1 = _crc32_O0 (crc, i);
+      uint32_t res2 = _crc32 (crc, i);
+      if (res1 != res2)
+        abort ();
+      crc = res1;
+    }
+}
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */
+/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */
+/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
new file mode 100644
index 00000000000..c88cafadedc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32-data8.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */
+/* { dg-skip-if "" { *-*-* } { "-flto"} } */
+
+#include <stdint.h>
+#include <stdlib.h>
+
+__attribute__ ((noinline,optimize(0)))
+uint32_t _crc32_O0 (uint32_t crc, uint8_t data) {
+  int i;
+  crc = crc ^ data;
+
+  for (i = 0; i < 8; i++) {
+      if (crc & 1)
+        crc = (crc >> 1) ^ 0xEDB88320;
+      else
+        crc = (crc >> 1);
+  }
+
+  return crc;
+}
+
+uint32_t _crc32 (uint32_t crc, uint8_t data) {
+  int i;
+  crc = crc ^ data;
+
+  for (i = 0; i < 8; i++) {
+      if (crc & 1)
+        crc = (crc >> 1) ^ 0xEDB88320;
+      else
+        crc = (crc >> 1);
+  }
+
+  return crc;
+}
+
+int main ()
+{
+  uint32_t crc = 0x0D800D80;
+  for (uint8_t i = 0; i < 0xff; i++)
+    {
+      uint32_t res1 = _crc32_O0 (crc, i);
+      uint32_t res2 = _crc32 (crc, i);
+      if (res1 != res2)
+        abort ();
+      crc = res1;
+    }
+}
+
+/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */
+/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code."
0 "crc"} } */ +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32" "dfinish"} } */ +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c new file mode 100644 index 00000000000..d82e6252603 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data16.c @@ -0,0 +1,53 @@ +/* { dg-do run } */ +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */ +/* { dg-skip-if "" { *-*-* } { "-flto"} } */ + +#include +#include + +__attribute__ ((noinline,optimize(0))) +uint32_t _crc32_O0 (uint32_t crc, uint16_t data) { + int i; + crc = crc ^ data; + + for (i = 0; i < 8; i++) { + if (crc & 1) + crc = (crc >> 1) ^ 0x82F63B78; + else + crc = (crc >> 1); + } + + return crc; +} + +uint32_t _crc32 (uint32_t crc, uint16_t data) { + int i; + crc = crc ^ data; + + for (i = 0; i < 8; i++) { + if (crc & 1) + crc = (crc >> 1) ^ 0x82F63B78; + else + crc = (crc >> 1); + } + + return crc; +} + +int main () +{ + uint32_t crc = 0x0D800D80; + for (uint16_t i = 0; i < 0xffff; i++) + { + uint32_t res1 = _crc32_O0 (crc, i); + uint32_t res2 = _crc32 (crc, i); + if (res1 != res2) + abort (); + crc = res1; + } +} + +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 
0 "crc"} } */ +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */ +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c new file mode 100644 index 00000000000..7acb6fc239c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data32.c @@ -0,0 +1,52 @@ +/* { dg-do run } */ +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */ +/* { dg-skip-if "" { *-*-* } { "-flto"} } */ + +#include +#include +__attribute__ ((noinline,optimize(0))) +uint32_t _crc32_O0 (uint32_t crc, uint32_t data) { + int i; + crc = crc ^ data; + + for (i = 0; i < 32; i++) { + if (crc & 1) + crc = (crc >> 1) ^ 0x82F63B78; + else + crc = (crc >> 1); + } + + return crc; +} + +uint32_t _crc32 (uint32_t crc, uint32_t data) { + int i; + crc = crc ^ data; + + for (i = 0; i < 32; i++) { + if (crc & 1) + crc = (crc >> 1) ^ 0x82F63B78; + else + crc = (crc >> 1); + } + + return crc; +} + +int main () +{ + uint32_t crc = 0x0D800D80; + for (uint8_t i = 0; i < 0xff; i++) + { + uint32_t res1 = _crc32_O0 (crc, i); + uint32_t res2 = _crc32 (crc, i); + if (res1 != res2) + abort (); + crc = res1; + } +} + +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 
0 "crc"} } */ +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */ +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c new file mode 100644 index 00000000000..e8a8901e453 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/crc-crc32c-data8.c @@ -0,0 +1,53 @@ +/* { dg-do run } */ +/* { dg-options "-march=armv8-a+crc -O2 -fdump-rtl-dfinish -fdump-tree-crc" } */ +/* { dg-skip-if "" { *-*-* } { "-flto"} } */ + +#include +#include + +__attribute__ ((noinline,optimize(0))) +uint32_t _crc32_O0 (uint32_t crc, uint8_t data) { + int i; + crc = crc ^ data; + + for (i = 0; i < 8; i++) { + if (crc & 1) + crc = (crc >> 1) ^ 0x82F63B78; + else + crc = (crc >> 1); + } + + return crc; +} + +uint32_t _crc32 (uint32_t crc, uint8_t data) { + int i; + crc = crc ^ data; + + for (i = 0; i < 8; i++) { + if (crc & 1) + crc = (crc >> 1) ^ 0x82F63B78; + else + crc = (crc >> 1); + } + + return crc; +} + +int main () +{ + uint32_t crc = 0x0D800D80; + for (uint8_t i = 0; i < 0xff; i++) + { + uint32_t res1 = _crc32_O0 (crc, i); + uint32_t res2 = _crc32 (crc, i); + if (res1 != res2) + abort (); + crc = res1; + } +} + +/* { dg-final { scan-tree-dump "calculates CRC!" "crc"} } */ +/* { dg-final { scan-tree-dump-times "Couldn't generate faster CRC code." 0 "crc"} } */ +/* { dg-final { scan-rtl-dump "UNSPEC_CRC32C" "dfinish"} } */ +/* { dg-final { scan-rtl-dump-times "pmull" 0 "dfinish"} } */ -- 2.25.1