From patchwork Thu Mar 7 22:05:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Thibault Ferrante
X-Patchwork-Id: 1909476
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com
 (client-ip=185.125.189.65; helo=lists.ubuntu.com;
 envelope-from=kernel-team-bounces@lists.ubuntu.com;
 receiver=patchwork.ozlabs.org)
Received: from lists.ubuntu.com (lists.ubuntu.com [185.125.189.65])
 (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by legolas.ozlabs.org (Postfix) with ESMTPS id 4TrNfz74N2z1yWy
 for ; Fri, 8 Mar 2024 09:06:19 +1100 (AEDT)
Received: from localhost ([127.0.0.1] helo=lists.ubuntu.com)
 by lists.ubuntu.com with esmtp (Exim 4.86_2)
 (envelope-from )
 id 1riLsN-0007hC-AV; Thu, 07 Mar 2024 22:06:07 +0000
Received: from smtp-relay-canonical-0.internal ([10.131.114.83]
 helo=smtp-relay-canonical-0.canonical.com)
 by lists.ubuntu.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.86_2)
 (envelope-from )
 id 1riLsL-0007gw-Er
 for kernel-team@lists.ubuntu.com; Thu, 07 Mar 2024 22:06:05 +0000
Received: from Q58-sff.fritz.box (unknown [45.83.232.52])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits)
 server-digest SHA256)
 (No client certificate requested)
 by smtp-relay-canonical-0.canonical.com (Postfix) with ESMTPSA id ED5543F201;
 Thu, 7 Mar 2024 22:06:04 +0000 (UTC)
From: Thibault Ferrante
To: kernel-team@lists.ubuntu.com
Subject: [N/U] [PATCH 0/13] crypto: qat - improve recovery flows
Date: Thu, 7 Mar 2024 23:05:38 +0100
Message-ID: <20240307220551.3529171-1-thibault.ferrante@canonical.com>
X-Mailer: git-send-email 2.43.0
MIME-Version: 1.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

BugLink: https://bugs.launchpad.net/bugs/2056354

[Impact]

This set improves the error recovery flows in the QAT drivers and adds a
mechanism to test them through a heartbeat error simulator.
It is an upstream patch set that has been applied to linux-next and is
scheduled for 6.9.
Link to the upstream submission:
https://patchwork.kernel.org/project/linux-crypto/cover/20240202105324.50391-1-mun.chun.yep@intel.com/

We should apply this set to the Noble 6.8 kernel to reduce QAT issues and
improve maintainability.
One additional commit is required to update the configuration.

[Test case]

Unload and reload the module to verify that QAT recovers and logs issues
properly.
Use the added error injection mechanism to verify the recovery flow.

[Fix]

Apply the following commits (from linux-next):
  2ecd43413d76 Documentation: qat: fix auto_reset section
  7d42e097607c crypto: qat - resolve race condition during AER recovery
  c2304e1a0b80 crypto: qat - change SLAs cleanup flow at shutdown
  9567d3dc7609 crypto: qat - improve aer error reset handling
  750fa7c20e60 crypto: qat - limit heartbeat notifications
  f5419a4239af crypto: qat - add auto reset on error
  2aaa1995a94a crypto: qat - add fatal error notification
  4469f9b23468 crypto: qat - re-enable sriov after pf reset
  ec26f8e6c784 crypto: qat - update PFVF protocol for recovery
  758a0087db98 crypto: qat - disable arbitration before reset
  ae508d7afb75 crypto: qat - add fatal error notify method
  e2b67859ab6e crypto: qat - add heartbeat error simulator

[Regression potential]

We may see QAT regressions when the device crashes or when the module is
restarted.
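
As an illustration of the error injection step in the test case above, here
is a minimal Python sketch. It is not part of the series. It assumes the
simulator is exposed as an `inject_error` attribute next to a `status`
attribute in the device's debugfs heartbeat directory (please check
Documentation/ABI/testing/debugfs-driver-qat from this series for the actual
names and paths), and that the kernel under test was built with
CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION enabled. The helper names are
hypothetical.

```python
from pathlib import Path


def inject_heartbeat_error(heartbeat_dir: Path) -> None:
    """Request a simulated heartbeat failure.

    Assumption: writing 1 to `inject_error` triggers the simulator. The
    attribute name is not taken from this cover letter.
    """
    attr = heartbeat_dir / "inject_error"
    if not attr.exists():
        # Most likely the kernel was built without the injection option.
        raise FileNotFoundError(f"{attr} not found; is error injection enabled?")
    attr.write_text("1\n")


def heartbeat_status(heartbeat_dir: Path) -> str:
    """Return the raw content of the (assumed) `status` attribute."""
    return (heartbeat_dir / "status").read_text().strip()
```

A tester would call `inject_heartbeat_error()` as root with the heartbeat
directory of the device under test, then poll `heartbeat_status()` and watch
the kernel log to confirm that the device is reset and recovers.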

Damian Muszynski (4):
  crypto: qat - add heartbeat error simulator
  crypto: qat - add auto reset on error
  crypto: qat - change SLAs cleanup flow at shutdown
  crypto: qat - resolve race condition during AER recovery

Furong Zhou (3):
  crypto: qat - add fatal error notify method
  crypto: qat - disable arbitration before reset
  crypto: qat - limit heartbeat notifications

Giovanni Cabiddu (1):
  Documentation: qat: fix auto_reset section

Mun Chun Yep (4):
  crypto: qat - update PFVF protocol for recovery
  crypto: qat - re-enable sriov after pf reset
  crypto: qat - add fatal error notification
  crypto: qat - improve aer error reset handling

Thibault Ferrante (1):
  UBUNTU: [Config] Disable CONFIG_CRYPTO_DEV_QAT_ERROR_INJECTION

 Documentation/ABI/testing/debugfs-driver-qat  |  26 ++++
 Documentation/ABI/testing/sysfs-driver-qat    |  20 +++
 debian.master/config/annotations              |   1 +
 drivers/crypto/intel/qat/Kconfig              |  14 ++
 drivers/crypto/intel/qat/qat_common/Makefile  |   2 +
 .../intel/qat/qat_common/adf_accel_devices.h  |   2 +
 drivers/crypto/intel/qat/qat_common/adf_aer.c | 138 +++++++++++++++++-
 .../intel/qat/qat_common/adf_cfg_strings.h    |   1 +
 .../intel/qat/qat_common/adf_common_drv.h     |  10 ++
 .../intel/qat/qat_common/adf_heartbeat.c      |  20 ++-
 .../intel/qat/qat_common/adf_heartbeat.h      |  21 +++
 .../qat/qat_common/adf_heartbeat_dbgfs.c      |  52 +++++++
 .../qat/qat_common/adf_heartbeat_inject.c     |  76 ++++++++++
 .../intel/qat/qat_common/adf_hw_arbiter.c     |  25 ++++
 .../crypto/intel/qat/qat_common/adf_init.c    |  12 ++
 drivers/crypto/intel/qat/qat_common/adf_isr.c |   7 +-
 .../intel/qat/qat_common/adf_pfvf_msg.h       |   7 +-
 .../intel/qat/qat_common/adf_pfvf_pf_msg.c    |  64 +++++++-
 .../intel/qat/qat_common/adf_pfvf_pf_msg.h    |  21 +++
 .../intel/qat/qat_common/adf_pfvf_pf_proto.c  |   8 +
 .../intel/qat/qat_common/adf_pfvf_vf_proto.c  |   6 +
 drivers/crypto/intel/qat/qat_common/adf_rl.c  |  20 +-
 .../crypto/intel/qat/qat_common/adf_sriov.c   |  38 ++++-
 .../crypto/intel/qat/qat_common/adf_sysfs.c   |  37 +++++
 24 files changed, 607 insertions(+), 21 deletions(-)
 create mode 100644 drivers/crypto/intel/qat/qat_common/adf_heartbeat_inject.c